A NEW APPROACH TO THE PROBLEM OF CONCEPTUAL
THINKING IN SCHIZOPHRENIA


ROBERT W. ZASLOW 


DEPARTMENT OF PSYCHOLOGY, UNIVERSITY OF CALIFORNIA


AND 


VETERANS ADMINISTRATION HOSPITAL, PALO ALTO, CALIFORNIA


DISTURBANCE in concept formation has long
been recognized as the most important feature of schizophrenic
thought disorder. This disturbance is marked 
chiefly by fluidity of thought processes and a 
fluctuation of conceptual boundaries. Where 
intellectual deterioration in the schizophrenic 
patient is manifested, the structure of his con- 
cepts is generally labile, meaning shifts in a 
unique manner from one common conceptual 
characteristic to another, and the content of one 
conceptual realm is often mixed with that of 
others. 


It appeared that these basic features of 
schizophrenic thinking might be thrown into 
relief by means of a sorting test that presented 
concepts in the form of a continuum. Such a 
test was devised, and it has been used in a 
preliminary investigation of hospitalized schiz- 
ophrenics. The results of this investigation in- 
dicate that the test, which permits objective 
scoring of the subject’s performance, can de- 
tect the more subtle aspects of conceptual con- 
fusion, measure a conceptual span and provide 
an objective index of rigidity or fluidity in 
thinking.

The present study takes as its point of de-
parture the work of those investigators who,
through the use of sorting tests such as the
Hanfmann-Kasanin (Vigotsky) Test and the 
Goldstein-Scheerer Color-Form and Objects- 
Sorting Test, have noted in schizophrenics an 
interpenetration of conceptual boundaries and 
a general looseness of the conceptual span [2, 
3, 4, 6, 8, 9, 10, 12]. These tests, however, 
present the subject with dichotomous groupings 
in which there is no overlap of the conceptual 
boundaries. As an example we may consider 
the Goldstein-Scheerer Color-Form Test: 
when the triangles are grouped in one category 
the result may be considered in terms of tri- 
angularity and nontriangularity, the circles and 
squares representing nontriangularity; and the
same principle obtains for the circle and square 
categories. The striking differences among the 
forms make the test an easy one for all but the 
most disorganized schizophrenics. 


The Hanfmann-Kasanin Test, on the other 
hand, is much more difficult, many normals be- 
ing unable to perform the required task suc- 
cessfully. This difficulty of the test is due to 
two important features of the test materials: 
(a) the variety of different characteristics of 
the blocks used presents a confusing test situa- 
tion in that the possibilities of categorization 
are very numerous and (b) the test presents a 
difficult problem in abstraction, since the prin-
ciple used for successful solution represents a
fusion of two basic concepts, height and size. 

The test used in the present study was de- 
signed to permit more sensitive detection of 
conceptual confusion than can be achieved with 
the tests discussed above. The essential feature 
of the present test is a continuum starting with 
a true triangle and ending with a true circle.
This continuum extends over a series of four- 
teen designs that vary in degree of circularity or 
triangularity. Each design is printed on a sep- 
arate card and each, when properly placed, has 
a definite position on the continuum. 


Concepts based on a continuum require a 
considerable degree of subjective judgment. 
The continuum is equivalent to a process
(triangularity to circularity); and schizo-
phrenics may be particularly disturbed in
their approach to process concepts. In addi- 
tion, the continuum fulfills the condition of a 
dimensional field composed of three interlock- 
ing concepts (triangle, circle, and middle area 
or fringe). Angyal contends that the schizo- 
phrenic fails to maintain the links of a dimen- 
sional field in which the elements are subordin- 
ate and determined by the field conditions as a 
whole [1]. 

The present test, then, seems to offer several 
advantages over the current sorting tests of the 
Goldstein-Scheerer or Hanfmann-Kasanin 
type. These advantages are: (1) precise mea- 
surement of the subject’s conceptual span; (2) 
reduction of the influence of superior intellect 
and education; (3) simplification of the task 
so that extraneous confusion is minimized; and
(4) greater sensitivity to conceptual confusion
because of the nondichotomous arrangement of
the overlapping concepts. 


METHOD OF THE EXPERIMENT 


Subjects. The test was administered to 
twenty-four diagnosed schizophrenics randomly 
selected from a ward at the Veterans Adminis- 
tration Hospital, Palo Alto, California. All of 
these patients had been institutionalized less 
than 7 years, the majority less than 5 years. 
The educational background of this group was 
as follows: twelve patients had attended gram- 
mar school, ten had completed high school, two 
patients had attended college for one year. The 
types of schizophrenic reaction found in this 
sample may be divided into paranoid and non- 
paranoid, the majority falling into the para- 
noid classification. 


Fig. 1. Designs used in the test of conceptual
thinking. The designs, when placed in one row,
form a continuum extending from perfect triangu-
larity to perfect circularity. In testing, each design
is drawn on a card 2 inches square. The side of
the equilateral triangle is 1 1/4 inches, and the di-
ameter of the circle is 1 3/4 inches.


A control group of sixteen patients, compar-
able in age and educational background to the
schizophrenic group, was selected from a gener-
al surgery ward at the Veterans Administra-
tion Hospital, Oakland, California. The only
criterion used for selection was the patient’s
ability to leave his bed in order to take the test.
All of the subjects in both the control and 
schizophrenic groups were veterans of World 
War II, males between twenty and fifty years 
of age, the majority being less than thirty-five 
years of age. 

Test Procedure. The test procedure consists 
of four major parts: 


Part A. The cards are scattered randomly 
and the subject is requested to arrange them in 
one row in the best possible manner. 

Part B. The true triangle and circle are 
placed at some distance from each other and the 
subject is asked to arrange the remaining cards 
(which are scattered as in Part A) along a 
continuum from triangularity to circularity. 


Part C. The cards are correctly placed on 
the continuum and the subject is requested to 
indicate where the triangles and circles end; 
that is, how many cards belong to the triangle 
and circle groups. This establishes conceptual 
boundaries of a certain magnitude for these 
two concepts.

Part D. The subject is then requested to re- 
move those cards which are neither triangles 
nor circles. This procedure measures the ability 
to maintain the conceptual boundaries previous- 
ly established. 


RESULTS 


Part A of the test indicates that the continu- 
um is composed of three basic concepts and 
that it produces essentially the same results as 
the Hanfmann-Kasanin Test [8]; that is, there
emerge three conceptual levels, though here
there is the additional feature of a numerical 
score that well represents the conceptual level 
of the subject. A performance on the highest 
level orders the randomly placed cards along a 
continuum extending from triangularity to cir-
cularity or vice versa. The intermediate level
of conceptualization may be described as an at- 
tempt to group the cards in accordance with the 
three basic concepts that make up the continu- 
um, but with no definite order within the 
groups, or of the groups on the continuum. 
The primitive level is indicated when the sub- 
jects do not order the cards as loose conceptual 
groups but, instead, arrange them as simple 
pairs or patterns, or with no consistency at all. 
Both groups of subjects perform on all three 
levels, but the schizophrenic group contains a 
higher percentage of cases performing on the 
intermediate and primitive levels. 

Part B yields a more significant differentia-
tion between the two groups. A critical score
of twenty-five separates fifty-four per cent of
the schizophrenics from all of the normals. A
score on this part of the test
represents the sum of the deviations squared, 
where a deviation is the difference between the 
proper position and the actual placement of a 
card. 
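
The scoring rule for Part B lends itself to a brief worked illustration. The sketch below is not part of the original procedure; it assumes that the cards are numbered 1 through 14 from the true triangle to the true circle, and that a placement is recorded as the list of card numbers in the order in which the subject laid them out. The function name `part_b_score` and the sample placements are hypothetical.

```python
def part_b_score(placement):
    """Return the sum of squared deviations for one arrangement.

    placement[i] is the number of the card found at position i + 1 of
    the row (assumed numbering: 1 = true triangle, 14 = true circle).
    A deviation is the difference between the position a card occupies
    and the position it should occupy, which equals its own number.
    """
    return sum((position - card) ** 2
               for position, card in enumerate(placement, start=1))


# A correct arrangement yields no deviations at all.
perfect = list(range(1, 15))

# Hypothetical error: cards 4 and 5 are exchanged (one step each).
swapped = perfect.copy()
swapped[3], swapped[4] = swapped[4], swapped[3]

print(part_b_score(perfect))  # 0
print(part_b_score(swapped))  # 2
```

Because the deviations are squared, a single card placed far from its proper position adds more to the score than several cards displaced by one step each.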

Part C yields several highly significant scores 
which may be used as diagnostic indicators. 
Using the normal group as a criterion, the 
boundary of “triangle” may be said to lie be- 
tween cards 3 and 5. Ninety-four per cent of 
the normals and 17 per cent of the schizophre- 
nics establish this boundary within this range. 
Two schizophrenic groups may now be distin-
guished:

1. Thirty-seven per cent of these patients 
choose cards 1 or 2 as the triangle boundary, 
and are thus more rigid or constricted in their 
conceptualization than are the normal subjects. 

2. Forty-six per cent of the schizophrenics es- 
tablish the boundary at a point beyond the nor- 
mal range, and are thus more fluid in their con- 
ceptualization. The “fluid” group shows what 
Rapaport and Schafer [10, 12] regard as an
essential characteristic of schizophrenic think- 
ing, i.e., looseness of the conceptual span. 

The boundary of the “circles” is placed at 
cards 11-12 by the normals. Here we find 
ninety-four per cent of the normals and forty-
three per cent of the schizophrenics. The tri-
angle concept distinguishes the more clearly be-
tween normals and schizophrenics, probably be-
cause the triangles merge into the middle area
of the continuum by finer gradations than do 
the circles. 

The size of the fringe or middle area pro- 
vides another significant score: ninety-four per 
cent of the normals and only twenty-nine per 
cent of the schizophrenics establish a middle 
area which contains no less than 4 and no more 
than 8 cards. Only the schizophrenics reduce 
the middle area to zero, or permit the circle and 
triangle concepts to overlap. 
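
The Part C criteria may be restated as a small classification routine. This is a minimal sketch under stated assumptions, not the scoring procedure of the study: it assumes that the triangle boundary is recorded as the number of the last card the subject assigns to the triangle group, that the circle boundary is recorded as the number of the first card assigned to the circle group, and that the middle area consists of the cards lying strictly between the two. The normal ranges (cards 3 to 5, and a middle area of 4 to 8 cards) are taken from the text; all names are hypothetical.

```python
NORMAL_TRIANGLE_RANGE = (3, 5)  # triangle boundary of the normal group
NORMAL_FRINGE_RANGE = (4, 8)    # size of the middle area in normals


def classify_triangle_boundary(last_triangle_card):
    """Label a triangle boundary as rigid, normal, or fluid."""
    low, high = NORMAL_TRIANGLE_RANGE
    if last_triangle_card < low:
        return "rigid"   # boundary at card 1 or 2
    if last_triangle_card > high:
        return "fluid"   # boundary beyond the normal range
    return "normal"


def fringe_size(last_triangle_card, first_circle_card):
    """Count the cards strictly between the two groups (assumed
    convention). Zero means no middle area; a negative value means
    the triangle and circle groups overlap."""
    return first_circle_card - last_triangle_card - 1


def fringe_is_normal(size):
    low, high = NORMAL_FRINGE_RANGE
    return low <= size <= high


print(classify_triangle_boundary(2))  # rigid
print(classify_triangle_boundary(8))  # fluid
print(fringe_size(4, 11))             # 6
print(fringe_size(8, 8))              # -1 (overlapping concepts)
```

Under this convention the two schizophrenic subgroups described above fall on opposite sides of one numerical range.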

Part D, which measures the instability and 
interpenetration of boundaries, produces very 
significant results. While eighty-eight per cent 
of the normals maintain the boundaries they 
established in Part C, only eight per cent of 
the schizophrenics are consistent in this respect. 
Sixty-three per cent of the schizophrenic group 
show direct interpenetration by actually break- 
ing the boundaries previously established. 


DISCUSSION 


In general, Angyal’s contention that schizo-
phrenic thinking shows impairment of system-
connections rather than of relationships is con-
firmed [1]. The schizophrenic cannot unify
the three concepts mentioned in the present test 
into an integrated system, although he is able 
to differentiate between the triangle and circle.³
His conceptual boundaries prove to be un- 
stable, and this instability is magnified when he 
is faced with the subtle gradations along the 
continuum, gradations which in effect represent 
an attenuation of the boundaries of a triangle 
concept and of a circle concept. 

The range of similarity that the schizophre- 
nic treats as functionally equivalent tends to 
be either constricted to an abnormal degree or 
diffusely expanded due to overinclusion of es-
sentially dissimilar objects. This would appear
to be what is involved in much of the bizarre-
ness or poor judgment so characteristic of schiz-
ophrenic thinking. A constricted conceptual
grouping may be considered as a sign of rigidity
or “concreteness.” An expanded conceptual
grouping, on the other hand, may be considered
as a sign of that fluidity which is believed to
result from the perseverating tendencies of the
schizophrenic. Fluidity tends to contaminate
a concept and rob it of its essential meaning.

³Angyal describes dichotomous conceptual groups
as semistructured organizations which produce
pairs or relata rather than unified dimensions. Car-
rying the point further, he maintains that the
schizophrenic is primarily disturbed in maintaining
the links of a system or dimensional field rather
than in relationships between relata.

By defining fluidity and rigidity as differen- 
tial ranges on the continuum, an artificial dich- 
otomy is avoided. Fluidity and rigidity may be 
regarded as opposite extremes of the same pro-
cess; the difference between the two may be ex-
pressed in quantitative rather than qualitative
terms. Normal conceptual thinking involves a 
balanced flexibility that avoids being too fluid 
or too rigid.* The present test appears to be an 
appropriate instrument for measuring this flex- 
ibility. Finally, the present approach seems to 
resolve an apparent contradiction in Gold- 
stein’s use of the term “concrete behavior,” for 
he would call both of the present groups of 
schizophrenics “concrete” in spite of the fact 
that they fall on opposite sides of the normal 
range [6, 7]. 


RESEARCH POSSIBILITIES 


It should be rewarding to use the present 
test in the study of the conceptual span which 
Rapaport [10] and Schafer [12] emphasize in 
their clinical analysis of various sorting tests. 
These authors speak of the essential looseness 
of the conceptual span of schizophrenics and of 
the narrowness of the span in depressives. The 
test proposed in this paper suggests the possi- 
bility that the conceptual span can be rigorously 
defined and quantified. 

Important differences among schizophrenics 
themselves could be explored by this test. For 
instance, paranoid schizophrenics and nonpara-
noid types should react differently, and it may
be possible to separate “process” from “reac-
tive” schizophrenics. Differential diagnosis be-
tween schizophrenia and psychosis due to or-
ganic brain damage may be aided by the test.
There is a strong possibility that this test corre-
lates highly with form-level and other features
of the Rorschach.

*It is possible that rigid and fluid conceptualiza-
tion on the continuum is the result of a basic in-
tolerance for ambiguity. The rigid person avoids
ambiguity by remaining close to the true triangle
or circle. The fluid person handles the ambiguous
middle by overgeneralizing the triangle or circle
concept. In the normal, however, the ambiguous
middle is appropriately organized into a middle or
“fringe” concept. This view is suggested by Dr.
Else Frenkel-Brunswik’s discovery of a general in-
tolerance for ambiguity in the social perception of
ethnocentric subjects [5].

The preliminary nature of this study should 
be emphasized. The results suggest the possi- 
bility that the test may be a valid and usable di- 
agnostic tool for clinical use. The general struc- 
ture of thought disturbance may be fruitfully 
explored, not only in relation to the various 
clinical entities, but in relation to the ego mech- 
anisms of the individual. A research project is 
now being designed for the purpose of estab- 
lishing norms that may be diagnostically useful 
in differentiating the following clinical groups: 


1. Schizophrenics (paranoid and nonparanoid).
2. Depressives. 
3. Patients with organic brain damage. 
4. Neurotics. 


A large sample of normal controls must also 
be tested in order to establish the validity and 
reliability of criteria used to distinguish the 
normals from the clinical group. In addition, 
the reliability of the test should be determined 
by a retest of the various samples used. 
Received November 30, 1949. 
DELUSIONS IN SCHIZOPHRENIA AS A FUNCTION
OF CHRONOLOGICAL AGE

WESTERN PSYCHIATRIC


INVESTIGATION of a possible relationship
between the structure of delusions in
schizophrenia and the age of onset of this
illness has received little more than incidental 
attention from workers in the field of the be- 
havior disorders. Yet it is apparent that demon- 
stration of such a relationship would be of in- 
terest to theorists concerned with the formula- 
tion of the dynamics of this disorder. 

While it is not logically valid to attribute be- 
havioral changes to the mere passage of time, 
the events, experiences, and perhaps structural 
changes which are concomitant to this dimen- 
sion are reflected in behavior and the admission 
of chronological age as a variable in psycholog- 
ical research is acceptable after noting this 
qualification. In the field of topographical psy- 
chology for instance, factors such as “rigidity” 
have been shown to be associated with chron- 
ological age [4]. Likewise personality traits, 
ability to learn, cognitive structure, etc., have 
been related to this variable. 

In the field of psychopathology, numerous 
workers have interpreted the depressive, intro-
punitive ideas occurring in involutional melan-
cholia and the depressions of later life as a con-
sequence of the decline in health and of the re-
alization that one was not likely to attain cer-
tain longtime ambitions and goals [2]. In
schizophrenia, however, clinical observation 
suggests a quite different age-ideation associa- 
tion, for it is in this disorder that one finds fre- 
quently a great deal of self-criticism, intropuni- 
tiveness, and conflict in the younger group 
which seems to be absent in the older group. 
This suggests the possibility of rather marked 
differences in personality structure between 
patients who develop involutional psychoses or 
depressions in middle age and those who devel-
op schizophrenia, and points to the need for re-
search designed to clarify these differences. The
first problem, however, seems to be the need
for ascertaining whether differences actually
do exist in the delusional structure of two
groups of schizophrenics of widely disparate
ages-of-onset.

A search of the literature on delusions has 
indicated a paucity of material bearing on the 
subject of age and delusions. Klaesi [3] sug- 
gested that certain forms of delusions are a 
function of age and pointed to the frequency of 
delusions of being robbed and poverty-stricken 
in the aged, to the infrequency of (pathologi- 
cal) delusions in children, and to the frequency 
of “delusions of suggestion” in young men and 
older women. 

The present study attempts to obtain quan- 
titative evidence of the possible differential fre- 
quency of delusion-type in two groups of schiz- 
ophrenics which differ in chronological age. 


METHOD 


Two groups of schizophrenics, one composed 
of patients under twenty-five years of chronologic-
al age and the other of patients over forty
years of age, were compared with respect to type
of delusions exhibited on admission to a mental 
hospital. The comparison was effected by chi- 
square analyses of the differences in incidence 
between the two groups for each of five types of 
delusions described below. Other analyses 
which used age and sex in combination were 
made and are reported. 

The Sample Population. Case records used 
in the present study were taken from the files 
of the Western Psychiatric Institute. The 
avowed admission policy of this hospital, du- 
ring the period in which the records used were 
obtained, was to limit admissions to patients 
whose psychoses were of less than three years’
duration and who had had no treatment else-
where. It is recognized that there are great dif-
ficulties in determining the exact duration of
patients’ illnesses and further that exceptions 
are frequently made to admission criteria. Care 
was taken in the present sample to select only 
first admissions for inclusion, and patients were 
eliminated if there was evidence that their psy- 
choses were of more than three years duration. 
These controls were necessary to avoid contam- 
inating the data by using schizophrenics in the 
older group whose psychoses were of greater 
chronicity. If appreciable differences in the du- 
ration of illness had existed in the groups, dif- 
ferences in delusional thinking could have been 
attributable to this factor. As has been indi- 
cated, every effort was made to control for this 
possibility in selecting cases for the present 
study and it is felt that such control was rela- 
tively effective. However, final confirmation of 
the findings is dependent upon independent 
replication. 

The Classification of Delusions. The classi- 
fication of delusions is described in detail with 
illustrative case-material and estimates of re- 
liability in another study [1]. The classifica- 
tion grew out of a content analysis of materials 
abstracted from case records. The five principal
types of delusions include: (a) bizarre and de- 
personalization, in which the patient believes 
that he, or the world, has changed in some 
strange manner, and is accompanied by bizarre 
notions concerning the change; (b) self-con- 
demnatory, in which he feels that he is worth- 
less, ugly, etc., has incurable diseases, is dying, 
is responsible for the troubles of the world; (c) 
persecutory, in which he feels that others are 
plotting against him, are trying to harm him, 
etc.; (d) wish-fulfillment, in which he believes 
something is true because he needs so strongly, 
or wants so desperately, for it to be true; this
type of delusion is usually sexual and is distinct 
from the grandiose in that it frequently in- 
volves protestations that the subject is the un- 
willing recipient of proposals, indecent sugges- 
tions, etc.; and (e) grandiose, in which he is 
famous, omnipotent, wealthy, talented, etc. 

It was found that nearly every schizophrenic 
delusion could be classed in this system, and 
that reliability for other clinicians was uni- 
formly high. 


Treatment of the Data. The two disparate
age groups were compared with regard to inci-
dence of each type of delusion in one degree of
freedom chi-square tables. Chi-square computa- 
tions were made incorporating Yates’ correc- 
tion. Further similar analyses were made which 
examined for possible differences between older 
men and younger men, and older women and 
younger women. 
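
The computation described above may be illustrated by the standard formula for a fourfold table with Yates’ correction. The sketch below is offered only as an illustration: the function name is hypothetical, the cell counts in the example are invented for the purpose and are not data from the present study, and clamping the corrected difference at zero is a common safeguard that the text does not mention.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square (one degree of freedom) with Yates' correction.

    Assumed layout of the fourfold table:
                     delusion present   delusion absent
        younger             a                  b
        older               c                  d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Yates' correction reduces |ad - bc| by n / 2; the difference is
    # clamped at zero so that the correction cannot overshoot.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Invented counts: 10 of 40 younger and 20 of 40 older patients
# exhibit the delusion in question.
chi_sq = yates_chi_square(10, 30, 20, 20)
print(round(chi_sq, 2))  # 4.32
```

With one degree of freedom, a value above 3.84 corresponds to a probability below .05, so the invented table above would count as a positive finding at that level.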


RESULTS 


Table 1 summarizes the frequency of inci- 
dence of each type of delusion for the groups 
used in the study. Table 2 summarizes the 
positive findings. 








TABLE 1
RELATIONSHIP OF DELUSIONS TO EXTREME AGE GROUPS

Group        Delusion                    Compared with  Greater (+) or less (-)  Chi-square  p
All young    Bizarre and depersonalized  All old        +                        2.60        .16
Young men    Bizarre and depersonalized  Old men        -                        2.89        .10
All young    Self-condemnatory           All old        +                        4.77        .06
Young women  Self-condemnatory           Old women      +                        4.91        .065
All young    Persecutory                 All old        -                        5.93        .02
Young women  Persecutory                 Old women      -                        2.54        .15
Young men    Persecutory                 Old men        -                        3.88        .05





From these findings it will be evident that 
there is a relationship between the age of onset 
of schizophrenia and the nature of the delusions 
exhibited by such patients. The differences are 
most trustworthy in the persecutory type of de- 
lusion where it is found that the older group 
has the greater proportional incidence of this 
These data suggest that there may be dif-
ferences in the personality structure of patients
PERSONALITY AND EMPATHY


ROSALIND F. DYMOND 
IN previous articles the author has discussed
the concept empathy and outlined some
rudimentary attempts at measuring em-
pathic ability [2, 3, 4]. In this paper some de-
tailed results of a revised test will be given
(Rating Test B), and the personality patterns 
of those who showed very high ability, as mea- 
sured by this test, will be compared to those 
whose scores ranked them as having very low 
empathy. 


EMPATHY AND RELATED CONCEPTS 


The term “empathy” itself presents some
problems since it has been used in the literature 
with a variety of meanings. Also, other terms 
have been used with the same or very similar 
meaning to that which will be used here, name- 
ly, the imaginative transposing of oneself into 
the thinking, feeling, and acting of another. 
Some of the overlapping terms which must be 
distinguished from empathy are: sympathy, in- 
sight, identification, and projection. 

Some forms of sympathy seem to be based 
on the empathic process. As Mead says: 


The attitude that we characterize as that of sym- 
pathy springs from this same capacity to take the 
role of the other person with whom one is socially 
implicated. Sympathy always implies that one 
stimulates himself to his assistance and considera- 
tion of others by taking to some degree the attitude 
of the other person whom one is assisting. The 
common term for this is “putting yourself in his 
place” [9, p. 366]. 


Koestler, too, has followed this reasoning in 
his discussion of the two concepts. 


Empathy can be described as a process of “pro- 
jection” or “introjection”; both are metaphors re- 
ferring to the experience of partial identity be- 
tween the subject’s mental processes and those of 
another with the resulting insight into the other’s 
mental state and participation in his emotions. Em- 
pathy becomes sympathy when to this mental res- 
onance is added the desire to collaborate or help 
[8, p. 360]. 
Empathy is viewed, then, as a neutral pro-
cess. It may lead to positive feelings and closer
social relations, as when it results in sympathy, 
but this is not necessarily the case. 


Insight may also be thought of as a product 
of the empathic process. Insight into oneself 
seems to require the ability to stand off and 
look at oneself from the point of view of 
others. In order to see ourselves as others see 
us, we need to structure the situation from 
their perspective or transpose ourselves into 
their thinking and feeling. Insight into others
also appears to be dependent upon the ability to 
take the role of others. More and more clini-
cians are coming to accept this position, partic-
ularly those of the Rogers school of client-
centered therapy. In a quotation from Raskin,
Rogers states: 


As time has gone by we have come to put in-
creasing stress on the “client-centeredness” of the
relationship because it is more effective the more
completely the counselor concentrates upon trying
to understand the client as the client sees himself
[11, p. 86].


Identification appears to be a very special 
kind of role taking; one that is more lasting, 
less frequent, and more emotional than is im- 
plied in the term empathy. As defined by 
Healy, Bronner and Bowers, identification 
implies : 


The unconscious molding of one’s own Ego after 
the fashion of one who has been taken as a model. 
Primary identification is the earliest expression of 
an emotional tie with a person. The distinction 
between identification (primary and secondary)
and object-love lies, according to Freud, in a dif-
ference between what one would unconsciously
like to be and what one would like to have [6, p.
230].


There is no implication that one would un- 
consciously like to be the other person in order 
to empathize with him, nor does empathy nec-
essarily imply any emotional tie with the other.
Rogers makes this distinction rather clearly: 


. . . the experiencing with the client, the living
of his attitudes, is not in terms of an emotional 
involvement or emotional identification on the 
counselor’s part, but rather an empathic identifica- 
tion where the counselor is perceiving the hopes 
and fears of the client through immersion in an 
empathic process . . . [11, p. 86]. 


Projection seems to be an antithetical process 
to empathy since projection involves the attri- 
bution of one’s own wishes, attitudes and be- 
havior to some thing, or some one other than 
the self. If projection is involved, therefore, the 
thoughts and feelings of the self are attributed 
to the other rather than those of the other be- 
ing experienced. The individual who attempts 
to understand the behavior of others using pro- 
jection as the mechanism, assumes that “since 
this is how I would feel if I were in his situa- 
tion, this is how he must feel.” Predictions 
based on projection, therefore, may or may not 
be accurate but one runs the risk of distorting 
reality by impressing onto others one’s own 
meanings. 


Proceeding from the aforementioned defini- 
tion of empathy (as the ability to transpose 
oneself into the thinking, feeling and acting of 
another), it is obvious from a common sense 
point of view that there is a good deal of indi- 
vidual variation in this ability. Some people ap- 
pear to be very sensitive to cues as to how 
others are feeling and reacting while others 
appear to be grossly unaware of the thoughts 
and feelings of others. This “faculty” of be- 
ing able to see things from the other per- 
son’s point of view, while it does not insure 
more respect or admiration for the other, does 
seem to assure more effective communication 
and understanding. For this reason it appears to 
be a most challenging and important area for 
investigation. 

In many ways the state of knowledge and re-
search in this area at this time is analogous to
the situation regarding the concept of intelli- 
gence at the turn of the century. There were 
obvious differences between individuals. There 
were also differences within the same individu- 
al depending upon the type of material on 
which he was exercising this ability (in the case 
of empathy the specific other). It was a difficult 
concept to define as there were many definitions 
current. Lastly, it seemed to be a neutral ability
which might be used either to aid or exploit 
others. All these statements may also be made 
about the concept empathy. The author hoped 
that the same method which proved of such 
value in solving some of the important prob-
lems in the area of intelligence, the construc-
tion and standardization of a test, could be suc-
cessfully applied to this area of empathy. The
first test, which was very preliminary in nature, 
has been reported elsewhere [4]. Wide inter- 
personal differences in ability to predict the re- 
sponse of others were found in a highly homo- 
geneous population of students. These differ- 
ences were substantially greater than those an- 
ticipated on the basis of chance. 


THE TEST: RATING TEST B 


Rating Test B, a revised form of Rating 
Test A [4] was designed along similar lines to 
check the results obtained on the first test and 
to further explore this proposed ability. It, too, 
was made up of four parts. In part 1 the 
subject was required to rate himself on a five- 
point scale on each of six traits. The traits used 
throughout the test were: 


1. superior-inferior
2. friendly-unfriendly
3. leader-follower
4. shy-self-assured
5. sympathetic-unsympathetic
6. secure-insecure

In part 2 he was asked to rate another in-
dividual on the same six traits. In part 3 he
was asked to predict how the other will rate 
himself on the same six traits, and finally, in 
part 4 he must predict how the other will 
rate him (the subject), on these six items. In 
this way, a measure of one subject’s ability to 
see things from the point of view of the other 
can be derived by calculating how closely his 
predictions of the other’s ratings on part 3 and 
part 4, coincide with the other’s actual ratings 
of himself and the subject (on his part 1 and 
part 2), and vice versa. 

This test, then, requires the subject to predict
the behavior of others in a given situational
context, namely how they will mark a simple 
rating test, and provides a check on his accura- 
cy. It seems that in order to predict accurately 
the subject would have to take the role of the 
other in order to see him from his own point 
of view and to see himself from the point of 
view of the particular other. The two types of
prediction were thought to be based on the 
same process, empathy, so that the two scores 
were merged into a final empathy rating. Since 
each rating is made on a five-point scale, the 
test can be scored in terms of the total number 
of points the individual is in error in his predic- 
tions. This was called the Deviation Score and 
was the one commonly used as a measure of em- 
pathy. Another method of scoring consisted in 
counting the number of predictions which were 
exactly accurate. This was called the Right 
Score. 
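
The two scoring methods just described can be illustrated with a minimal sketch. The function names and the sample ratings below are hypothetical and are not taken from the study; the sketch assumes only that each prediction and each actual rating is a whole number on the five-point scale.

```python
def deviation_score(predicted, actual):
    """Total number of scale points by which the predictions are in error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual))


def right_score(predicted, actual):
    """Number of predictions that coincide exactly with the actual ratings."""
    return sum(1 for p, a in zip(predicted, actual) if p == a)


# Hypothetical data: one subject's twelve predictions about one other
# (six part 3 items followed by six part 4 items), each on the 1-5 scale,
# and the ratings the other actually gave.
predicted = [3, 4, 2, 5, 3, 4, 2, 3, 3, 4, 1, 5]
actual = [3, 5, 2, 3, 3, 4, 1, 3, 4, 4, 1, 2]

print("Deviation Score:", deviation_score(predicted, actual))
print("Right Score:", right_score(predicted, actual))
```

A low Deviation Score and a high Right Score both indicate accurate prediction, so the two measures run in opposite directions.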


PROCEDURE 


The sample for this exploratory study was 
again a highly homogeneous one in that they 
were all student members of a social psychology 
class. There was a total of eighty students in 
the group; 41 males and 39 females. The age 
range was 18 to 38 with a mean of 22.7 years. 
The class members were asked to enroll in 
groups of sixteen for one of five evenings. Since 
the general tendency was to come with some 
friends, each sample of sixteen contained some 
people who knew each other very well as well 
as some who had never spoken to each other. 
However, this was not a “first impressions” 
test as all had been members of the same class 
for four months previously. 


Each person was assigned a number from one 
to sixteen. The procedure consisted of three peri- 
ods. In period one, numbers: 

1,2,3,4.—5,6,7,8.—9,10,11,12.—13,14,15,16.— 
were placed in four separate corners of the room, 
each around a table so that they constituted four 
distinct groups. The only instructions given were: 
“get to know each other as well as you can in the 
next few minutes.” After ten minutes the group 
interaction was stopped and Rating Test B was 
distributed. When it had been completed, period 
two began. In it, numbers: 

1,5,9,13.—2,6,10,14.—3,7,11,15.—4,8,12,16.— 
made up the four groups. The same procedure was 
then followed. Lastly, for period three, numbers: 
1,6,11,16.—2,7,12,13.—3,8,9,14.—4,5,10,15.— 
made up the four groups. 

In this way, each subject was, in the course of 
the evening, a member of three different groups 
and interacted with nine different others. Thus 
there were four scores available for each subject, 
one for each of the three subgroups in which he 
participated and a total score. 
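
The claim that the rotation brings each subject into contact with nine different others can be checked mechanically. The sketch below transcribes the three seating arrangements given above; the function name is an illustrative assumption.

```python
# Seating plan transcribed from the three periods described above.
PERIODS = [
    [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16)],
    [(1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15), (4, 8, 12, 16)],
    [(1, 6, 11, 16), (2, 7, 12, 13), (3, 8, 9, 14), (4, 5, 10, 15)],
]


def partners_met(periods):
    """Map each subject number to the set of others he shared a table with."""
    met = {}
    for groups in periods:
        for group in groups:
            for member in group:
                met.setdefault(member, set()).update(
                    other for other in group if other != member
                )
    return met


met = partners_met(PERIODS)
print({subject: len(others) for subject, others in sorted(met.items())})
```

No pair of subjects sits together twice, so three periods with three partners each yield nine distinct others for every subject.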


RESULTS 
The Deviation Scores of the entire sample 
ranged from 40 to 125, a difference of 85
points. The theoretic range of errors was 0 to 
432, since each subject predicted the ratings of 
nine others on twelve items and it was possible 
to be as much as four points in error on any 
one item. The mean based on the interaction 
with nine others, was 73.2 and the SD 15.8. 
The distribution of scores conforms to a normal 
curve. 

The difference between the mean number of 
right predictions anticipated on the basis of 
chance (21.6) and the mean Right Score ac- 
tually obtained (48.5), was significant beyond 
the 1 per cent level, showing that, rough as it 
is, this device is measuring some ability other 
than chance to predict the behavior of others. 
The correlation of the two components of the 
Deviation Score, errors in prediction of what 
the others will say of themselves (part 3), and
errors in prediction of what the others will say 
of oneself (part 4), was +.63. 
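
The theoretical ceiling of 432 error points and the chance expectation of 21.6 right predictions both follow from the design of the test. The short sketch below reproduces the arithmetic; it assumes that "chance" means guessing uniformly among the five scale points, which is an interpretation that happens to reproduce the reported figure rather than a procedure stated in the text.

```python
OTHERS = 9          # others whose ratings each subject predicted
ITEMS = 12          # six part 3 items plus six part 4 items per other
SCALE_POINTS = 5    # five-point rating scale

predictions = OTHERS * ITEMS                       # predictions per subject
max_deviation = predictions * (SCALE_POINTS - 1)   # four points off on every item
chance_right = predictions / SCALE_POINTS          # one guess in five is exact

print(predictions, max_deviation, chance_right)
```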


Again on the revised test the females were 
slightly better predictors than the males, the 
difference being significant at the 3 per cent 
level. It was also found that females were more 
easily predicted than males, as both males and 
females made better scores when predicting fe- 
males than males. This led to the hypothesis 
that this so-called empathic ability might be re- 
ciprocal in nature, so that it is easier to empath- 
ize with a person who has high empathy than 
with one whose empathy is low. Correlating 
the number of errors each individual made in 
predicting the responses of others (Deviation 
Score), with the total number of errors made 
by the others in attempting to predict his rat- 
ings, gave a coefficient of +.60. This seemed 
to show that on the whole it is easier to predict 
the responses of a person who is highly em- 
pathic (as measured by this test), than one of 
low empathic ability regardless of one’s own 
level of ability. Similarly, regardless of one’s 
own level of ability, it is more difficult to pre- 
dict a person with low empathy (as measured 
by this test). 


VALIDITY AND RELIABILITY 


The question of the validity and the reliabil- 
ity of this highly preliminary attempt at mea- 
surement is far from satisfactorily answered at 
this stage. Using the split-half method and cor- 
relating the score on items 1, 3, and 5 with the 
score on items 2, 4, and 6 gave a coefficient of
reliability of +.82. The data on validity were 
not as conclusive. However, corroborating evi- 
dence was derived from three separate sources. 


1. A judge’s rating technique. The instruc- 
tor in the course was given a list of twenty 
names containing the ten highest and ten low- 
est scorers in random order. He was asked to 
judge the empathic ability of these subjects as 
either “high” or “low.” His ratings of these 
twenty subjects agreed with the test results in 
sixteen cases. 


2. An empathic index derived from the 
TAT by a method described elsewhere [3], 
gave a mean TAT empathy rating of the ten 
lowest scorers of 18 per cent (range 8 to 30 
per cent), whereas the mean TAT empathy 
index of the ten highest scorers was 36.1 per 
cent (range 30 to 44 per cent). 

3. At the close of the experiment, the sub- 
jects were asked to write a brief account of the 
bases on which they made their predictions. 
They were instructed to introspect and report 
as clearly as possible the operations through 
which they had gone and the data they used. By 
the subjects’ own reports, they were trying, for 
the most part, to take the role of the other and 
so get his perspective. 
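
The split-half estimate mentioned at the opening of this section can be sketched as follows. The per-item error data are hypothetical, and the plain product-moment correlation of the two half scores is an assumption; the text does not say whether any further correction was applied.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_r(errors_by_subject):
    """Correlate each subject's errors on items 1, 3, 5 with those on 2, 4, 6."""
    odd = [sum(e[i] for i in (1, 3, 5)) for e in errors_by_subject]
    even = [sum(e[i] for i in (2, 4, 6)) for e in errors_by_subject]
    return pearson_r(odd, even)


# Hypothetical data: total error points per trait for four subjects.
errors_by_subject = [
    {1: 2, 2: 1, 3: 0, 4: 1, 5: 1, 6: 2},
    {1: 5, 2: 4, 3: 3, 4: 4, 5: 4, 6: 3},
    {1: 1, 2: 2, 3: 2, 4: 1, 5: 0, 6: 1},
    {1: 3, 2: 3, 3: 4, 4: 2, 5: 2, 6: 3},
]

print(round(split_half_r(errors_by_subject), 3))
```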


INSIGHT AND EMPATHY 


To investigate the relation of insight to em- 
pathy Allport’s criterion of insight was used, 
“the relation of what he thinks he has to what 
others think he has” [1]. An insight index was 
derived from Rating Test B by comparing a 
subject’s own judgments about himself (on 
part 1) to the judgments that the others made 
about him (on their part 2). The number of 
coinciding ratings was then totaled. Since each 
subject rated himself on six items nine times, 
and these were compared to the ratings of him 
by nine others, the highest number of coincid- 
ing ratings, or insightful ratings obtainable 
was 54. The actual mean frequency of coin- 
ciding judgments was 22.6 and the range 11 
to 38. The correlation of insight scores with 
Deviation Scores was +.65. Having a self-con- 
ception which agreed well with the conception 
that others had of one (insight) seemed to be 
highly related to the ability to take the role of 
the other (empathy) as measured by this test. 
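
The insight index described above amounts to a count of coinciding ratings. The sketch below is a minimal illustration with hypothetical data; the function name and the layout of the inputs (nine sheets of six ratings each) are assumptions made for the example.

```python
def insight_index(self_ratings, ratings_by_others):
    """Count the trait ratings on which self-view and others' view coincide.

    self_ratings: nine lists of six self-ratings (part 1), one per other.
    ratings_by_others: nine lists of six ratings of the subject (their part 2).
    """
    return sum(
        own == theirs
        for own_sheet, their_sheet in zip(self_ratings, ratings_by_others)
        for own, theirs in zip(own_sheet, their_sheet)
    )


# Hypothetical subject who rates himself 3 on every trait each time.
self_ratings = [[3] * 6 for _ in range(9)]
# Hypothetical others: each agrees with him on the first two traits only.
ratings_by_others = [[3, 3, 4, 2, 5, 1] for _ in range(9)]

MAX_INSIGHT = 6 * 9  # six traits compared with nine others, as in the text
print(insight_index(self_ratings, ratings_by_others), "of", MAX_INSIGHT)
```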

Although this work was based on a sample 
which was both too small and too specialized in
nature for any general conclusions to be 
drawn, the work has resulted in some fruitful 
leads. Rather wide individual differences in 
ability were discovered within a homogeneous 
group. It is presumed that a wider sampling of 
the general population would result in even 
wider differences, particularly if the abnormal 
population be included. Not only were inter- 
personal variations of considerable extent evi- 
denced, but there were also intrapersonal varia- 
tions in ability which appeared to be due to 
several factors: the particular other being
rated, his empathy with the subject, the motiva- 
tion of the subject, and, tentatively, the sub- 
ject’s familiarity with the role of the other. 


CASE STUDIES OF THE EXTREMES 


In order to gain some insight into the per- 
sonality and life history factors behind these 
differences in ability, a small group of extreme- 
ly high and extremely low scorers was selected 
for further study. 

The rough criterion of selection was those 
whose scores put them more than one standard 
deviation away from the mean in either direc- 
tion. This gave a sample of six whose scores 
ranked them as highly empathic and seven 
whose scores ranked them as very low in em- 
pathic ability. Although these are extremely 
small groups, some of the differences between 
the two groups are so striking that they seem 
worth while reporting. The hypotheses to which 
these results lead, should, of course, be checked 
on larger and more representative samples. The 
composition of the two extreme groups is 
shown in Table 1. 


TABLE 1
COMPOSITION OF EXTREME EMPATHY GROUPS

                     Age              Sex          Deviation Score
                Mean   Range     Male   Female     Mean    Range
Highs (N=6)     25.1   18-33       3      3        43.5    40-55
Lows  (N=7)     22.0   19-23       4      3        97.0    89-125
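
The rough selection criterion for the extreme groups can be made explicit with the mean and standard deviation reported under Results. The sketch below is illustrative only; the function name is an assumption, and the lowest score in the low group (89) falls right at the upper boundary, which is in keeping with the criterion being described as rough.

```python
MEAN_DEVIATION = 73.2   # reported sample mean of the Deviation Scores
SD_DEVIATION = 15.8     # reported standard deviation

HIGH_CUTOFF = round(MEAN_DEVIATION - SD_DEVIATION, 1)  # few errors: high empathy
LOW_CUTOFF = round(MEAN_DEVIATION + SD_DEVIATION, 1)   # many errors: low empathy


def extreme_group(deviation_score):
    """Classify a Deviation Score lying more than one SD from the mean."""
    if deviation_score < HIGH_CUTOFF:
        return "high empathy"
    if deviation_score > LOW_CUTOFF:
        return "low empathy"
    return None


print(HIGH_CUTOFF, LOW_CUTOFF)
print(extreme_group(40), extreme_group(73), extreme_group(125))
```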





The thirteen extreme subjects made individ- 
ual appointments to take a series of personality 
tests. They were not told that they were ex- 
treme scorers on the Rating Test, but only that 
they were part of a sample that had been 
drawn from the total group for further study. 

The four tests used were the Wechsler- 
Bellevue Adult Intelligence Test, Form I
[13], the Murray and Morgan Thematic Ap- 
perception Test, Third Revision [10], the Ror- 
schach inkblot test [12], and the California 
Ethnocentrism Test [5]. 


Two meetings were held with each subject. 
During the first the Wechsler was admin- 
istered, scored and the results discussed with 
the subjects. Then the TAT was admin- 
istered. Only eleven pictures were used to make 
administration in one period practical. In the 
second meeting the Rorschach was admin- 
istered, then the TAT results were discussed. 
Following this the subjects were interviewed 
about their family background, personal rela- 
tionships, future plans, present problems, and 
so forth. Lastly, the subjects were given coded 
copies of the California Ethnocentrism Test 
which they were requested to fill out and mail 
in unsigned. Further information on these sub- 
jects was available in the form of a Self-Analy- 
sis written as an assignment for the course in 
social psychology. 


WECHSLER DIFFERENCES OF EXTREME GROUPS 


There did not, at first, seem to be any signi- 
ficant difference in the intelligence of the two 
groups. The mean full IQ of the “good” em- 
pathizers was 132.1 and the range 127-137, 
and the mean full IQ of the “poor” empa- 
thizers was 126.4 and the range 105 to 138. 
However there was a striking difference be- 
tween the verbal and performance scores of 
the two groups (Table 2). 


TABLE 2
COMPARISON OF THE MEAN VERBAL AND MEAN PERFORMANCE IQ's OF THE EXTREMELY
HIGH AND LOW EMPATHIZERS

                         Verbal IQ    Performance IQ
High Empathy Group         128.6          130.5
Low Empathy Group          130.0          116.5





This group difference in the two scores of 
the low group was not caused by a few extreme 
cases, as it was observed in all but one of the 
seven low cases. These people appear to func- 
tion best on the abstract verbal level but seem 
to be somewhat at a loss to deal with concrete 
situations and particularly as they relate to 
other people. 


RORSCHACH DIFFERENCES BETWEEN EXTREME
GROUPS 


The Rorschach differences between the two 
extreme groups were also striking. The first 
significant factor was the F per cent. The F 
per cent is supposed to represent the degree of 
control the subject has over his personal spon- 
taneity. An F per cent up to 50 is considered 
normal but between 50 and 80 it frequently 
corresponds to signs of inflexibility and con- 
striction, that is, excessive control amounting 
to repressiveness. The mean F per cent of the 
good or high empathizers was 33.5 per cent 
and of the low or poor empathizers 47 per cent. 
None of the high group had an F per cent as 
high as 50, whereas five of the seven lows had 
F per cents of 50 or over, one being as high as 
70. This points to the possibility that those 
low in empathy may be more rigid than those 
who are higher. 

The second important difference was in ex- 
perience balance or Erlebnistyp which refers to 
the extent the individual is responsive to 
promptings from within (introverted), or 
without (extroverted). This balance is indicated
primarily by the M : sum C ratio. The
results of the testing of the two small extreme
groups show that while the M : sum C ratios
of the highs tend to be weighted on the C side
(showing a preponderance of color or emotional
responses), the ratios of the lows are
weighted in the opposite direction (showing
more responsiveness to intellectual promptings
and promptings from within generally) (Table 3).


TABLE 3
CERTAIN RORSCHACH RATIOS OF HIGH AND LOW EMPATHIZERS

   M : sum C Ratios          FC : CF+C Ratios
   High       Low            High       Low
  Empathy   Empathy         Empathy   Empathy
   0:4½      2:½             0:4       1:0
   3:4½      2:1             3:2       0:2
   2:3       6:1             2:2       0:1
   3:1       10:7            2:0       2:6
   2:2½      4:3             2:1       0:3
   2:2       7:2             2:0       2:1
             3:1½                      1:1


Klopfer and Kelly state that: 


If M exceeds twice sum C there is too little 
affective energy in emotional contact with the outside
world due either to withdrawal or repression
[7, p. 230]. 


It will be noted that in five of the seven low 
cases M either equals or exceeds twice sum C. 
Having observed this tendency for the lows to 
be overbalanced on the M, or intellectual side 
and underbalanced on the C, or emotional, and 
that the reverse is the indication for the high 
empathizers, other indicators of emotional bal- 
ance were checked. FC responses, which are 
supposed to represent properly channelized 
emotional impulses, emotion that the subject is 
not trying to repress and yet which is expressed 
in forms consistent with the situational de- 
mands, should exceed CF+C. These latter 
represent more impulsive emotional reactions. 
If FC is not greater than CF+C there is dan- 
ger of insufficient outer control. 

Four of the seven lows had a total of CF+C
which exceeded their FC total, showing tenden- 
cies toward uncontrolled emotional outbursts 
rather than well controlled emotional contact 
with others (Table 3). 
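
The count just given can be checked against the FC : CF+C columns of Table 3. The sketch below transcribes those two columns and applies the strict comparison used in the preceding sentence (CF+C exceeding FC); the variable and function names are illustrative assumptions.

```python
# (FC, CF+C) pairs transcribed from the FC : CF+C columns of Table 3.
HIGH_EMPATHY = [(0, 4), (3, 2), (2, 2), (2, 0), (2, 1), (2, 0)]
LOW_EMPATHY = [(1, 0), (0, 2), (0, 1), (2, 6), (0, 3), (2, 1), (1, 1)]


def cf_c_exceeds_fc(fc, cf_plus_c):
    """True when the more impulsive color responses outnumber FC."""
    return cf_plus_c > fc


lows_flagged = sum(cf_c_exceeds_fc(fc, cfc) for fc, cfc in LOW_EMPATHY)
highs_flagged = sum(cf_c_exceeds_fc(fc, cfc) for fc, cfc in HIGH_EMPATHY)

print("lows with CF+C > FC:", lows_flagged, "of", len(LOW_EMPATHY))
print("highs with CF+C > FC:", highs_flagged, "of", len(HIGH_EMPATHY))
```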

Klopfer and Kelly state: 


It is easily understandable that a combination of
unchecked predominance of CF over FC with a
predominance of FM over M gives the most unfavorable
picture: a mixture of infantilism with lack
of control [7, p. 230].


All four of the lows who had CF+C greater
than FC also had higher FM than M. There 
is only one case where this is true in the high 
group. 

Another Rorschach difference between these 
two small extreme groups was in the amount 
of texture responses, or Fc, produced by each 
group. Only three of the seven lows produced 
any Fc, whereas five of the six highs had some 
in their records. This is a rather interesting 
difference since Fc is claimed to be an indicator 
of sensitivity and tact. 

If one can generalize at this point about the 
personality patterns of these two groups from 
the common signs they exhibit and the differen- 
ces between them, those who have low empath- 
ic ability on Rating Test B tend to be rather 
immature, introverted, motivated from within, 
but have some conflict with these motivations. 
They are rather rigid and constricted, con- 
sciously attempting to keep themselves under 
control. They are afraid of emotion and are 
not capable of forming many good emotional
contacts with the outside world. Their emo- 
tions are more of the explosive variety, which 
are held back and so build up until they explode 
and overflow the controls. 

On the other hand those whose empathy ap- 
pears to be high from their scores, show rather 
extroverted personality structures, being more 
responsive to promptings from without. They 
are in better balance and more at peace with 
themselves, not showing as many signs of anxi- 
ety and depression as those whose ability is 
low. Their emotional contact with others is 
wider, they are adaptive to others and their 
contacts are rich and satisfying. They are sensi- 
tive to the feelings of others and display social 
tact. 


TAT DIFFERENCES OF EXTREMES


Although the TAT analyses showed the subjects
to have very different personalities, there
were certain common elements to be found in 
the highly empathic group, and others in the 
low group, which marked these off as distinct. 
These factors were: 

1. Family atmosphere and relations. 

2. Orientation to others. 

3. Goals. 

4. Concept of Self. 


1. Family atmosphere and relations as seen 
by the subjects. The highly empathic subjects 
seem to come from families where the interper- 
sonal relations were close. There may have 
been problems and difficulties, but there was 
an underlying love and willingness to work 
things out. The family is a source of support 
and not an important problem area in the lives 
of the highs. 


Quite the opposite is the case in the low 
group. The family life of those in this category 
was, and is, anything but smooth. There is a
good deal of aggression against parental au- 
thority, conflict with siblings, hostility against 
overprotecting mothers, and frequently dis- 
rupted relationships between the parents them- 
selves either through death or incompatibility. 

2. Orientation to others. The highs as a 
group, have a great interest in other people. 
They are warm, affectionate people whose hap- 
piness requires good relations with others. They 
have many friends and are equalitarian in their 
relationships. 

The lows, as a group, are more frequently
lone wolves. They mistrust others and are 
afraid of getting hurt. They are unable to relate
themselves to others successfully even
when they desire to. They are most often ego- 
centric in their relationships, using other people 
for their own purposes, for the feeling of 
power and status it gives them. They need to 
dominate the other in most relationships. They 
have few friends and find it hard to make 
friends. They are unable to “give” emotionally 
to others; they are “takers.” The lows seem to 
be people who are not basically interested in 
other people except as they contribute to their 
own welfare. 

3. Major goals. The major goals of those 
whose scores identified them as highly empathic 
were most frequently in terms of home and 
family. Their concept of happiness involves
forming a close relationship with another
which would be lasting and satisfying. Their
main drive could be summed up as seeking to
establish a close, mutually interdependent relationship.


The goals of the lows, on the other hand, 
were more in terms of self-aggrandizement, becoming
well known through some occupation
where others would look up to them. There 
was a remarkable lack of love themes in their 
stories particularly in view of their age. Their 
main drive appears to be for success which is 
achieved either through individualistic achieve- 
ment or dominance of others. 

4. Concept of self. The highly empathic subjects
picture themselves as sensitive, idealistic
romantics who are aware of social problems
and have sympathy for the underdogs. They 
are aware of their need for others in order to 
fulfill themselves. 

The lows admire people who are capable, 
confident, cautious, rational and under control. 
They are frequently insecure people who have 
built up a shell of superiority and not caring 
about others in order to protect themselves. 

The free answers to the California Ethno- 
centrism Test corroborated and amplified some 
of these same points. 


SUMMARY 


The combined results of the Wechsler, the
Rorschach, the TAT, and the California Ethnocentrism
Test, together with the subjects’
own self-analyses, gave a picture of those whose
empathy is high as outgoing, optimistic, warm, 
emotional people, who have a strong interest 
in others. They are flexible people whose emo- 
tional relations with others, particularly their 
early family relations have been sufficiently 
satisfying so that they find investing emotion- 
ally in others rewarding. Their own level of 
security is such that they can afford an interest 
in others. While they are emotional people, 
their emotionality is well controlled and richly 
enjoyed. 

Those low in empathy are rather rigid, in- 
troverted people who are subject to outbursts 
of uncontrolled emotionality. They seem un- 
able to deal with concrete material and inter- 
personal relations very successfully. They are 
either self-centered and demanding in their 
emotional contacts or else lone wolves who pre- 
fer to get along without strong ties to other 
people. Their own early emotional relationships 
within the family seem to have been so dis- 
turbed and unsatisfying that they feel they can- 
not afford to invest their love in others as they 
need it all for themselves. They seem to mis- 
trust others, to encapsulate themselves and not 
to be well integrated with the world of reality. 
They seem to compensate for their lack of emo- 
tional development by stressing the abstract in- 
tellectual approach to life as the safest. Some 
of those in this group seemed to be aware of 
their patterns and of the unsatisfactory nature 
of their adjustment to other people; others have
rationalized their behavior to the extent of de- 
veloping a role of superiority which satisfies 
them. The mere fact that they are so inwardly 
oriented and rigid in their structure makes it 
impossible for them to empathize with others 
successfully. It is unimportant to them to know 
what the other is thinking and feeling; it is 
their own thoughts and feelings that count. 

While this work is in every way preliminary 
and inconclusive, it seems to have provided 
some rather interesting hypotheses to be tested, 
some clarification of problems involved, and 
perhaps some techniques to aid in their resolu- 
tion. 


Received December 7, 1949. 


DISCRIMINATION BETWEEN MATCHED SCHIZO- 
PHRENICS AND NORMALS BY THE 
WECHSLER-BELLEVUE SCALE¹


A. EDWIN HARPER, JR. 
EWING CHRISTIAN COLLEGE 
ALLAHABAD, U. P., INDIA 


ALTHOUGH some investigators, notably
Rapaport et al. [15], have attempted
to differentiate among the various
subcategories of schizophrenia, as did the
previous study of this series [6], the most popu- 
lar type of analysis has been that of schizophre- 
nics vs. normals. This is natural inasmuch as 
(1) it is difficult to get very large samples of 
schizophrenic subgroups, and (2) there is much 
less diagnostic uncertainty and much more 
clear-cut differentiation between a hospitalized 
schizophrenic group and an unhospitalized nor- 
mal population. There have, however, been 
many disagreements as to just what subtests 
normals and schizophrenics differ on, and how 
they differ. For just one example, Rabin [13] 
states that, relatively speaking, Comprehension 
does not deteriorate in schizophrenia, and Ob- 
ject Assembly does; Olch [12] finds the op- 
posite, that Object Assembly does not deterior- 
ate with schizophrenia, but Comprehension 
does. The use of small samples and relatively 
inefficient statistical techniques have been cited 
as two of the possible reasons for these differ- 
ences. 

SUBJECTS 


The schizophrenic sample of 245 cases has 
been described in the previous paper [6]. With 
this schizophrenic sample, a group of 237 nor- 
mal records was matched for age and full scale 
IQ. The closeness of the means and standard 
deviations of these two distributions was tested 
by the critical ratio technique. The highest of
the four CR’s was only .559, thus indicating
reasonably close matching. The “normals”
were normal in the sense that they were not in
mental hospitals. Most of them had been tested
by psychologists in connection with professional
services, and when there was any suspicion of
“emotional maladjustment” the record was
not used.

¹ This report is based on a portion of a dissertation
submitted to Columbia University in partial
fulfillment of the requirements for the degree of
Doctor of Philosophy. The author gratefully acknowledges
the wise and fruitful guidance received
from Dr. Irving Lorge, Dr. Robert L. Thorndike,
Dr. Nicholas Hobbs, and Dr. Percival M. Symonds.


The method of matching, by age and IQ, ar- 
bitrarily eliminated any possible differences 
which might have come from deterioration of 
the total IQ itself. However, this seems more
than compensated for by the fact that any dif- 
ferences found on the subtests can be assumed 
to be real subtest differences, and not just reflections
of CA or IQ differences.


PROCEDURE 


The matched groups of 245 schizophrenics 
and 237 normals were compared by means of 
Fisher’s discriminant function, an adaptation 
of the multiple regression technique to the 
problem of an undistributed dichotomous cri- 
terion. The raw scores of the ten subtests were 
used, plus scores representing education, and in- 
trasubtest scatters [5, 6] on eight of the sub- 
tests. However, since the prediction on the 
basis of all nineteen variables was only three 
per cent better than that on the basis of the ten 
subtests alone, only the latter part of the study 
will be reported. 


RESULTS 


The multiple correlation for the differentia- 
tion of the matched schizophrenics and nor- 
mals, on the basis of the ten raw subtest scores 
of the Wechsler-Bellevue, was R = .437. This 
is significant at better than the 1 per cent level
of confidence. Squaring this figure we find, 
however, that only 19 per cent of the variance
between these two groups has actually been ac- 
counted for in the multiple R. 

Table 1 gives an indication of the relative 
contributions of the various subtests to this dif- 
ferentiation. The beta values may be compared 
directly with one another, as they represent re- 
gression weights with original subtest variabili- 
ties equalized, i.e., regression weights for a 
series of standard scores. It can thus be seen 
that the Digit Symbol subtest was the most 
important one for differentiating between 
schizophrenics and normals, and that Informa- 
tion and Block Design were close behind. Arith- 
metic, Object Assembly and Picture Arrange- 
ment contributed least to the differentiation. 

TABLE 1
BETA AND b WEIGHTS DIFFERENTIATING SCHIZOPHRENICS AND NORMAL CONTROLS

                               Beta Weights    b Weights
Verbal Subtests
  1. Information                  +.275         +.0261
  2. Comprehension                -.153         -.0209
  3. Digit Span                   +.171         +.0379
  4. Arithmetic                   -.035         -.0063
  5. Similarities                 -.122         -.0137
Performance Subtests
  6. Picture Arrangement          +.097         +.0120
  7. Picture Completion           -.195         -.0329
  8. Block Design                 +.252         +.0166
  9. Object Assembly              +.062         +.0071
 10. Digit Symbol                 -.356         -.0143
Constant                                        +.4316





The weights to be applied to the raw subtest
scores themselves are the b weights in Table 1.
The full regression equation discriminating between
the 245 schizophrenics and the 237
matched normals of this study was as follows
(using the familiar abbreviations to represent
the raw subtest scores):

Mean Predicted Score = + .0261 I - .0209 C + .0379 D - .0063 A
    - .0137 S + .0120 PA - .0329 PC + .0166 BD + .0071 OA
    - .0143 DS + .4316

This is the equation to be applied to the
raw subtest scores of any test record to determine
whether it is “schizophrenic” or “normal.”

Applying these regression weights to the
mean raw scores for the schizophrenic and normal
samples, mean predicted scores were obtained:
+ .4029 for the normals and + .5939
for the schizophrenics. The critical ratio of
nearly 10 between these two scores again indicated
that the groups were reliably differentiated.
The midpoint between these two scores,
+ .4984, is the cutting-line for diagnosing
“schizophrenia” or “normality” from the subtest
pattern of any particular subject. From
the multiple R it was calculated that 33 out of
every 100 schizophrenics would be misdiagnosed
as “normal” by the application of the
regression weights developed in this study; the
remaining 67 per cent would be correctly diagnosed
as schizophrenic.
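
The use of the equation and the cutting-line may be sketched as follows. The b weights, the constant, and the two mean predicted scores are those given above; the function names and the sample record are hypothetical. The direction of the decision (scores above the cutting-line read as "schizophrenic") is inferred from the fact that the schizophrenic mean is the higher of the two.

```python
# b weights for the raw subtest scores, keyed by the usual abbreviations.
B_WEIGHTS = {
    "I": 0.0261, "C": -0.0209, "D": 0.0379, "A": -0.0063, "S": -0.0137,
    "PA": 0.0120, "PC": -0.0329, "BD": 0.0166, "OA": 0.0071, "DS": -0.0143,
}
CONSTANT = 0.4316

MEAN_NORMAL = 0.4029
MEAN_SCHIZOPHRENIC = 0.5939
CUTTING_SCORE = round((MEAN_NORMAL + MEAN_SCHIZOPHRENIC) / 2, 4)  # midpoint


def predicted_score(raw_scores):
    """Apply the regression equation to one record of ten raw subtest scores."""
    return CONSTANT + sum(B_WEIGHTS[k] * raw_scores[k] for k in B_WEIGHTS)


def classify(raw_scores):
    """Label a record by comparing its predicted score with the cutting-line."""
    if predicted_score(raw_scores) > CUTTING_SCORE:
        return "schizophrenic"
    return "normal"


# Hypothetical record, invented for illustration only.
record = {"I": 12, "C": 8, "D": 10, "A": 5, "S": 8,
          "PA": 8, "PC": 10, "BD": 20, "OA": 18, "DS": 35}

print(CUTTING_SCORE, round(predicted_score(record), 4), classify(record))
```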


APPLICATION OF THE WEIGHTS TO A NEW 
SAMPLE 


The diagnostic systems worked out by previ- 
ous investigators have often been applied to the 
problem of diagnosing new samples, but this 
has always been done by investigators other 
than the original, and often with conflicting 
results (e.g., Johnson’s [7] and Levine’s [8] 
studies of Wechsler’s [17] “signs”). None of
the investigators who has worked out a diag- 
nostic system has reported a cross-validation of 
his weightings on a new sample of the same 
population. It is therefore not known how 
much of the divergence of the results of new 
investigations is due to the statistical regression 
expected of any predictive formula, and how 
much is due to the fact that the new samples 
may not be homogeneous with the original in 
age, intelligence, cultural background, “cooper- 
ativeness” with testing, education, and a host 
of other possible factors. 

For the present study, the “cross-validation”
sample was composed of all of the schizophrenics²
who were tested in the nine months immediately
following the gathering of the original
sample. These 56 schizophrenics differed
from the original group only in the fact that 
diagnoses had not (in most cases) been con- 
firmed by as long a period of hospital residence. 

When the regression weights were applied 
to each of the 56 test records of the new 
sample, the computed scores ranged from 
+ .1306 to + 1.1555. Of these 56 scores, 38 
were above the cutting score of + .4984 and 
18 were below. Thus 68 per cent of these 
schizophrenics were correctly diagnosed, and 32 
per cent were incorrectly diagnosed. This is
reasonably close to our expectations of 67 per
cent vs. 33 per cent. Thus the regression
weights and, by implication, this form of scatter
analysis, have proved successful in discriminating
schizophrenics from normals on a better
than chance level.

² Because of the practical difficulties involved, a
comparable sample of normals was not obtained.
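
The cross-validation percentages follow directly from the counts reported for the new sample. The sketch below merely reproduces that arithmetic; the variable names are illustrative.

```python
NEW_SAMPLE = 56        # schizophrenics in the cross-validation sample
ABOVE_CUTTING = 38     # computed scores above the cutting score of +.4984

below_cutting = NEW_SAMPLE - ABOVE_CUTTING
per_cent_correct = round(100 * ABOVE_CUTTING / NEW_SAMPLE)
per_cent_incorrect = round(100 * below_cutting / NEW_SAMPLE)

print(below_cutting, per_cent_correct, per_cent_incorrect)
```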


Two general cautions should be given here, 
however, for any investigator who might intend 
to apply these weights to a new schizophrenic 
population. In the first place, the regression 
equation was developed for discrimination be- 
tween schizophrenics and normals; it cannot be 
expected to discriminate between schizophrenics 
and other mental hospital populations. Second- 
ly, its value is directly relevant only to the par- 
ticular group in which it was calculated. It 
will be useful for other schizophrenic groups 
only insofar as they match the experimental 
group in various important characteristics. 
What these characteristics are is, unfortunat- 
ely, not known. Among the possibilities are age 
and intelligence distribution, general socioecon- 
omic and cultural background, and the particu- 
lar diagnostic considerations which are 
weighted most heavily by the particular group 
of psychiatrists involved. It is quite probable 
that the regression weights will work less well 
in a different hospital. 


COMPARISON WITH OTHER STUDIES 


As was mentioned in the previous study [6], 
direct comparison with the work of other in- 
vestigators is difficult. The primary reason is 
the fact that other investigators have invariably 
used Wechsler’s weighted scores, whereas this 
study, for the sake of greater accuracy, has 
used the raw scores. The accurate translation 
of mean raw scores into strictly equivalent 
mean weighted scores would be an extremely 
complicated process, because of the uneven dif- 
ferences in the two distributions. For example, 
when 20 raw scores are fitted as whole num- 
bers into a curve of 18 weighted scores [17, p. 
228], some weighted scores represent one raw 
score each, and some weighted scores represent 
two raw scores each. The correlation between 
raw and weighted scores is never 1.00. Never- 
theless, rough comparisons must and will be 
made. Raw score means have been translated 
into roughly equivalent weighted score means, 
and despite minor errors in this process, the 
rank orders can be assumed to be fairly accurate.

TABLE 2
MEAN SCORES AND POINT BISERIAL CORRELATIONS OF
THE 245 SCHIZOPHRENICS AND THE 237
NORMAL CONTROLS

                              Mean Raw Scores       Point
                            Schizo-     Normal      biserial
                            phrenics    Controls    Correla-
                             (245)       (237)      tions
Verbal Subtests
  1. Information             11.99       11.04      .090*
  2. Comprehension            8.47        9.11      .088*
  3. Digit Span              10.16        9.72      .099*
  4. Arithmetic               5.31        5.26      .010
  5. Similarities             8.43        9.13      .079
Performance Subtests
  6. Picture Arrangement      7.80        7.76      .005
  7. Picture Completion       8.00        8.68      .115*
  8. Block Design            15.47       13.96      .099*
  9. Object Assembly         15.93       15.48      .053
 10. Digit Symbol            28.69       33.97      .213*
Intelligence Quotients
 11. Verbal IQ               89.67       89.12      .017
 12. Performance IQ          88.56       89.89      .042
 13. Full Scale IQ           88.25       88.68      .013
 14. Chronological Age       31.94       31.50      .025
 15. Education                9.52        9.08      .084

*Correlation significant. Levels of confidence are as
follows: 5 per cent = .088; 1 per cent = .115.


But first it is well to examine the basic data 
for these comparisons. Table 2 gives the raw 
score means for the 245 schizophrenics and the 
237 normals, which were matched on the basis 
of CA and full scale IQ. We can see from this 
table that the schizophrenics are higher on In- 
formation, Digit Span, Arithmetic, Picture Ar- 
rangement, Block Design, and Object Assem- 
bly; whereas the normals excel in Comprehen- 
sion, Similarities, Picture Completion, and 
Digit Symbol. The reader familiar with former 
studies will immediately recognize that many 
of these differences are in the directions found 
by former investigators. But the differences here 
are all small. Only on Block Design and Digit 
Symbol does the difference exceed one raw 
score point and, translating roughly to Wechs- 
ler’s weighted scores, only the Digit Symbol 
difference is above one weighted score. How- 
ever, in the third column of Table 2, the point 
biserial correlations indicate that many of these 
differences are significant. Only the differences 
in Digit Symbol and Picture Completion are 
significant at the 1 per cent level, but differ- 
ences in Digit Span, Block Design, Informa- 
tion and Comprehension are added by lowering 
the standard of significance to the 5 per cent
level of confidence. 
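
The reading of Table 2 given above can be retraced mechanically. The sketch below uses the ten subtest rows as read from Table 2 and the two confidence levels given in its footnote; treating a correlation equal to the tabled level as significant is an assumption made here so that the result agrees with the asterisks in the table.

```python
# (subtest, schizophrenic mean, normal mean, point biserial r), read from Table 2.
TABLE_2 = [
    ("Information",         11.99, 11.04, 0.090),
    ("Comprehension",        8.47,  9.11, 0.088),
    ("Digit Span",          10.16,  9.72, 0.099),
    ("Arithmetic",           5.31,  5.26, 0.010),
    ("Similarities",         8.43,  9.13, 0.079),
    ("Picture Arrangement",  7.80,  7.76, 0.005),
    ("Picture Completion",   8.00,  8.68, 0.115),
    ("Block Design",        15.47, 13.96, 0.099),
    ("Object Assembly",     15.93, 15.48, 0.053),
    ("Digit Symbol",        28.69, 33.97, 0.213),
]

R_5_PER_CENT = 0.088  # from the table footnote
R_1_PER_CENT = 0.115  # from the table footnote

# Subtests on which the schizophrenic mean is the higher of the two.
higher_in_schizophrenics = [name for name, s, n, _ in TABLE_2 if s > n]

# Subtests whose mean difference exceeds one raw score point.
over_one_point = [name for name, s, n, _ in TABLE_2 if abs(s - n) > 1.0]

# Assumption: r equal to the tabled level counts as significant.
sig_1 = [name for name, _, _, r in TABLE_2 if r >= R_1_PER_CENT]
sig_5_only = [name for name, _, _, r in TABLE_2
              if R_5_PER_CENT <= r < R_1_PER_CENT]

print("Higher in schizophrenics:", higher_in_schizophrenics)
print("Difference over one raw point:", over_one_point)
print("Significant at 1 per cent:", sig_1)
print("Added at 5 per cent:", sig_5_only)
```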

Olch [12] has tabulated the subtest rank orders
from several former studies. These data
may be compared, roughly, with those of this
study by the use of ρ, the rank order correlation.
The ρ’s comparing the 245 schizophrenics
of this study with those of other investigations
are as follows: with Rabin (N = 19) = + .19;
with Weider’s younger group (N = 20) =
+ .34; with Weider’s older group (N = 30)
= + .45; with Magaret (N = 30) = + .84;
with Olch Total (her Table III) (N = 32)
= + .71; with Rapaport Total (N = 72)³ =
+ .62. None of these correlations is startlingly
high; some are extremely low. The small N’s
involved, of course, cast doubt upon many of
these comparisons; the ρ’s do, however, reflect
not only the differences between this and previous
studies, but also the differences among the
previous studies themselves. While these intercorrelations
among other studies are not here
reported, they range from + .80 for Rapaport’s
total vs. Magaret to − .02 for Weider’s
younger vs. Olch’s younger. It seems unlikely
that more than a small portion of this lack of
agreement is due to the lumping together of a
wider age range in the present study than in
some of the previous ones.
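
For readers who wish to reproduce comparisons of this kind, the rank order correlation can be sketched as follows. The formula is the standard one for untied ranks; the function name and the two rank orders are hypothetical and are not data from any of the studies cited.

```python
def rank_order_rho(ranks_a, ranks_b):
    """Rank order correlation for two untied rankings of the same items.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the two ranks given to each item.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same items")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical rank orders of ten subtests in two studies (illustration only).
study_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
study_b = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

rho = rank_order_rho(study_a, study_b)
print(f"rho = {rho:+.2f}")
```

With only ten subtests to rank, a handful of interchanged positions moves ρ appreciably, which is one reason the comparisons above are described as rough.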


Comparing with Garfield [3], the 245 
schizophrenics in this study deviated from the 
237 normals in approximately the same manner 
as did his 67 schizophrenics from their control 
group of 46 neurotics, psychopaths, and alco- 
holics. The data of this study also agree with 
Garfield against Wechsler’s conclusions [13], 
on an unspecified population, that Object As- 
sembly is lower than Block Design in schizo- 
phrenics. Like Garfield’s data, however, these 
also confirm Wechsler’s statement that the sum 
of Picture Arrangement and Completion is 
less than the sum of Information and Block 
Design. 


As with Johnson [7] and Levine [8], the
data in this study cast general doubt on Wechsler’s
criteria, though beyond the facts given
in the previous paragraph no other specific
agreements or disagreements with Wechsler
can be reported. It is, in fact, difficult to agree
or disagree definitely with Wechsler’s criteria,
because the criteria themselves often give rather
broad ranges of deviation rather than definite
mean amounts [17, p. 150]. His contention,
as is Rapaport’s, is that the schizophrenics
are such a heterogeneous group that no one
set of signs can be valid for them all. This contention
is well taken. If, however, a particular
set of signs is diagnostic only for particular
kinds of schizophrenics, then methods should
be given for distinguishing between those
schizophrenics for whom the signs will or will
not be diagnostic. If this differentiation is
made, as one surmises from Rapaport’s discussions,
on the basis of types of verbalization and
behavior, i.e., the criteria used also by the psychiatrist,
then the analysis of subtest patterns
seems superfluous.

³The group here differs from the one that Olch
used, inasmuch as it contains all of Rapaport’s
schizophrenic groups. This seemed a more representative
comparison, because other investigators had
not made the kind of differentiations that Rapaport
made.


The index obtained by applying Rabin’s 
Schizophrenic Ratio [13] to the present data 
is 1.0, which is approximately the same as that 
obtained by applying the ratio to Rapaport’s 
schizophrenics [15], or to Rapaport’s or Wechs- 
ler’s normals. These findings confirm those of 
Webb [16] and Garfield [3]. In fact the low
relationship of Rabin’s rank orders, not only
with the present study (ρ = + .19) but also
with those of Rapaport (ρ = + .18), Magaret
(ρ = + .06), Weider (ρ = + .59 for the
younger and + .38 for the older subjects), and
Olch (ρ = + .02 for the younger and + .22
for the older subjects), suggests that his sample
of 19 schizophrenics was somewhat different 
from those of other studies. 


Not only the schizophrenics, but also the 
normal controls seem to differ from study to 
study. Rabin’s Schizophrenic Index [13] pro- 
duces different results when applied to his nor- 
mals, to Rapaport’s normals, to Wechsler’s 
normals, and to the normals of these data. The
rank correlation for the subtests of the 237
normals of this study vs. Rapaport’s 54 normals
is only ρ = + .55. The correlation with
Olch’s total group (taken from Wechsler [17],
N = 690) is ρ = + .78. Olch’s 345 younger
and 345 older normals (taken from Wechsler)
correlate with each other ρ = − .27, a difference
which seems more than should be expected
between two adjacent age groups (CA 17-29
and CA 30-49). Comparing the subtest deviations
reported by Lewinski [9], Weider [18],
Rapaport et al. [15], and the present study,
with those given by Wechsler [17] for both
average intelligence and for feeblemindedness,
the differences among the normal subjects used
in these studies seem to be consistent neither
with age nor with intellectual level. These differences
seem to deserve more thorough investigation
than can be given them within the scope
of this paper. It is at least possible that they reflect
differences in cultural backgrounds, as
well as different ranges in age and ability.


DISCUSSION 


The amount of variability, the lack of any
general agreement, and the often striking differences
among the various studies reviewed in
this paper raise serious questions as to the validity
of the whole process of pattern analysis.
Differences in range of age have been held 
largely responsible; yet the lack of high inter- 
relationship between the age-matched groups 
of Rabin, Weider, Magaret, and Olch casts 
doubt on differences in age-range as an ade- 
quate explanation of divergent findings. The 
rank order correlation of the subtest ranks in 
these studies suggests, rather, that interhospital
differences may be responsible. Weider’s older 
and younger schizophrenics correlate more 
closely with each other than either of these two 
groups correlates with any other group; the 
same is true of Olch’s older and younger schiz- 
ophrenics. Webb [16], after showing the lack 
of relevance of Rabin’s schizophrenic index for 
Rapaport’s schizophrenics, has suggested that 
possibly both Wechsler’s norms and Rabin’s 
ratio “become invalid when applied to the 
particular population utilized by Rapaport.” 
It is probable that both cultural differences (cf.
Machover [11]) and differences in IQ range
(cf. Estes [2] and Lewinski [9]) are important.
Differences in nosological considerations
have been emphasized by Rapaport [15], and
also contribute to these interhospital differences.

More basic than any of these reasons for 
difference, however, are those found in the na- 
ture of the Wechsler-Bellevue test itself, and 
the assumptions commonly made about it. 
First, it has seldom been explicitly recognized 
that the significance of a difference between 
subtests is dependent less on the normal standard
deviations of the subtests than on the individual
reliabilities of the subtests, and the interrelationships
between them. Although the reliabilities
of the individual subtests on normal
subjects were not available when this study 
was made, both Rabin’s studies [14] on a non- 
schizophrenic mental hospital population and 
Hamister’s [4] on a schizophrenic group sug- 
gested that these are seldom high enough to 
justify the consideration of any but fairly wide 
individual differences as significant. 
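
The dependence of a subtest difference on reliability and intercorrelation can be made concrete with the standard psychometric formulas for a difference between two scores on a common scale. The sketch below is illustrative only: the reliabilities, the intercorrelation, and the standard deviation of 3 are hypothetical values, not figures reported by Rabin or Hamister, and both formulas assume that the two subtests have equal standard deviations.

```python
import math


def reliability_of_difference(r_xx, r_yy, r_xy):
    """Reliability of the difference between two equally scaled scores.

    r_xx, r_yy: reliabilities of the two subtests
    r_xy:       correlation between the two subtests
    """
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)


def standard_error_of_difference(sd, r_xx, r_yy):
    """Standard error of measurement of the difference between two scores."""
    return sd * math.sqrt(2 - r_xx - r_yy)


# Hypothetical values for illustration only.
r_xx, r_yy, r_xy, sd = 0.70, 0.60, 0.50, 3.0

r_dd = reliability_of_difference(r_xx, r_yy, r_xy)
se_diff = standard_error_of_difference(sd, r_xx, r_yy)

print(f"reliability of the difference = {r_dd:.2f}")
print(f"standard error of the difference = {se_diff:.2f} weighted score points")
print(f"difference needed at about the 5 per cent level = {1.96 * se_diff:.1f}")
```

Under these assumed values a difference of nearly five weighted score points would be needed before it could be distinguished from error of measurement, which illustrates why only fairly wide individual differences can be treated as significant.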

Second and third, two unjustified and in fact 
contradictory assumptions have been made, 
often by the same investigator. The first of
these is the assumption that the subtests measure
“specific functions.” Elaborate “rationales”
of the subtests have been developed to explain
why schizophrenics do well on one and
poorly on another subtest. When the facts uncovered
by a new investigator change, there is
a tendency to change the “rationale” also. Actually,
both Balinsky [1] and Lorge [10] have
shown beyond a doubt that the subtests are not 
factorially pure. In the 35-44 year age group, 
for instance, Arithmetic has significant loadings 
[1] for three different factors. Furthermore, 
the factorial composition of the subtests 
changes, and changes inconsistently, with age. 
With this inconsistent mixing of several fac- 
tors in a subtest, the effect of factors which 
change with psychosis may well be minimized 
by the counter effect of those which do not de- 
teriorate. It is only as the subtests represent 
fairly pure factors, which are relatively unre- 
lated in a normal population (as differential 
subtest analysis has apparently wrongly assumed
to be true for the Wechsler-Bellevue),
that a truly consistent pattern of differences 
can be expected to emerge in connection with a 
psychosis. That significant differences actually 
have been shown on the basis of the Wechsler- 
Bellevue subtests is more to be wondered at 
than to be taken for granted. It suggests very 
strongly the possibility that, were the tests of 
greater factorial purity, much stronger patterns 
might well emerge. 


A third assumption has been made in attributing
significance to “intertest variability.”
(This term refers to the observation and mea- 
surement of the extent of the differences among 
subtests.) It assumes that the Wechsler-Belle- 
vue is a global measure of general intelligence, 
that the individual subtests are highly interrelated,
and that any large differences between
them are the indications of psychotic variability. 
This assumption is, of course, the exact oppos- 
ite of the assumption behind pattern analysis. 
In actual practice, however, these two contra- 
dictory assumptions are harmonized by the the- 
ory that a portion of the score of each subtest 
represents its interrelationship with the other 
subtests in measuring global intelligence, and 
the rest of the score is a unique measure of a 
“specific function.” (As pointed out before, a 
fairly large portion of each score should prob- 
ably be assigned to errors of measurement, al- 
so.) Lorge [10] has shown, however, that one 
factor (which is “something like Spearman’s
g”) accounts for varying proportions of the total
in different age groups. For the 25-29 year-old
group, for instance, it accounts for 27 per
cent of the total variability; for the 50-59 year-old
group it accounts for nearly twice as much
(50 per cent). Lorge concludes that the Wechsler-Bellevue
Scale “has not yet solved the problem
of measurement of adult intelligence either
globally or differentially.” If this is true, the
joint assumptions inherent in the measurement
of “pattern analysis” and “intertest variability”
are on very shaky ground indeed.


SUMMARY AND CONCLUSIONS 


Because of the many disagreements among 
various investigations of the subtest pattern of 
schizophrenics on the Wechsler-Bellevue Scale 
of adult intelligence, a new study using a larger 
sample population and more efficient statistical 
techniques seemed in order. For the present 
study a sample consisting of 245 schizophrenics 
was compared with a group of 237 normals, 
closely matched for age and full scale IQ. The 
comparisons were made between raw (rather 
than weighted) scores, by means of Fisher’s 
Discriminant Function, a multiple regression 
technique. The obtained regression equation 
was cross-validated on a new sample of 56 
schizophrenics from the same hospital as the 
original sample. 


1. The multiple differentiation between the
schizophrenic and normal groups was R =
.437. Thus only 19 per cent of the variance between
these two groups was accounted for by
this multiple R. Of a new group, 67 per cent
should be correctly and 33 per cent incorrectly
“diagnosed,” on the basis of the regression
formula.


2. Examination of the beta weights indicates
that the Digit Symbol subtest contributed most 
to the differentiation, and that Information and 
Block Design were next in importance. Arith- 
metic, Object Assembly and Picture Arrange- 
ment contributed least to the differentiation be- 
tween schizophrenics and normals. 

3. The regression equation was applied to
the schizophrenic and normal subtest means,
giving predicted mean scores of + .5939 and
+ .4029 respectively, with a midpoint (cutting
score) of + .4984. The critical ratio between
these two means was nearly 10, thus again indicating
a significant differentiation.

4. When the regression equation was applied 
to the 56 schizophrenics of the cross-validation 
sample, 68 per cent were correctly and 32 per 
cent incorrectly “diagnosed.”

5. Approximate weighted scores were computed
for the raw score means in this study,
for the purpose of comparison with other
studies. Correlations with other studies, by
means of the rank order correlation ρ, ranged
from + .19 to + .84, but were generally low.
Intercorrelations among other studies ranged
from ρ = − .02 to + .80, but were also generally
low.


6. There was also disagreement over the 
subtest means of the normals used in this and 
various other studies, rank order ρ’s ranging
from —.27 to +.80. Part of the disagreement 
among studies may be related to these differ- 
ences in the normal controls used. 

7. Such comparisons as have been made are 
put on an only tentative basis by the small num- 
bers of subjects and relatively inefficient statis- 
tical techniques employed in most of the studies 
involved. 

8. Discrepancies among various studies may 
also, often, be traced to differences in range of 
CA, IQ, cultural background, etc. There is 
more agreement among different-aged schizo- 
phrenics in a single hospital, than among com- 
parably-aged groups in different hospitals. 

9. Because researches have supported neither 
the basic assumption of subtest reliability nor 
the contradictory assumptions of factorial pur- 
ity of the subtests, and homogeneity of the 
Wechsler-Bellevue as a global measure, mea- 
sures of both “subtest pattern” and “intertest 
variability” are on rather shaky ground.


10. The two major general conclusions of
this study (including both this and the previous
paper) are that (a) subtest pattern analysis by 
means of the Discriminant Function regression 
equation technique is a valid and powerful 
means of differentiating among various clinic- 
al and normal groups; but that (b) the Wechs- 
ler-Bellevue Adult Intelligence Scale is not a 
really adequate instrument for psychiatric di- 
agnosis of mental disease. There would seem 
to be, however, enough data now at hand in 
the studies of Wechsler, Rapaport et al., Gras- 
si, and others to form a basis for the construc- 
tion of a new scale, more specifically pointed 
toward the problem of psychiatric diagnosis by 
means of subtest pattern. 
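
The arithmetic behind items 1, 3, and 4 of the summary can be retraced in a few lines. The figures are those quoted in the summary and in the cross-validation results; the function name classify is introduced here for illustration only.

```python
# Figures quoted in the summary.
multiple_r = 0.437
mean_schizophrenic = 0.5939   # predicted mean score, schizophrenic group
mean_normal = 0.4029          # predicted mean score, normal group

# Item 1: proportion of variance accounted for by the multiple R.
variance_accounted = multiple_r ** 2

# Item 3: the cutting score is the midpoint between the two predicted means.
cutting_score = (mean_schizophrenic + mean_normal) / 2


def classify(score):
    """Hypothetical helper: scores above the cutting score count as schizophrenic."""
    return "schizophrenic" if score > cutting_score else "normal"


print(f"variance accounted for = {variance_accounted:.3f}")
print(f"cutting score = {cutting_score:+.4f}")
# The extreme computed scores reported for the cross-validation sample.
print(classify(1.1555), classify(0.1306))
```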


Received September 14, 1949. 
WECHSLER-BELLEVUE FOR

Previously proposed short forms of
the Wechsler-Bellevue test, although
standardized and validated successfully
on a variety of clinical samples, had been found
by the present writers to be inapplicable to a
determination of the intellectual capacities of
mental defectives [1]. The purpose of this
study was to devise a new abbreviated scale
which would meet the criteria set up for a valid
clinical measurement of the intellectual capacities
of institutionalized mental defectives. Since
group tests are not feasible with defectives, a
short individual scale has considerable merit in
the overcrowded and understaffed institutions
for the mentally defective where there may be
pressure for routine testing or where time is
required for other psychological studies. There
was need for a scale which would: (a) reduce
the administration time, (b) measure intelligence
of defectives accurately, (c) be objective
and easy to score, and (d) retain items of diagnostic
significance.

One of the valuable and useful functions of 
the Wechsler-Bellevue test [4] in the diagnosis 
and classification of institutional defectives is 
its provision for separate verbal and performance
IQ’s in addition to a full scale IQ. This
is most important since there is frequently a 
great discrepancy between the verbal IQ and 
the performance IQ, the latter usually being 
the superior ability. Determination of these 
separate abilities not only provides adequate 
clinical information about defectives but is al- 
so of value administratively, since knowledge 
of these two major capacities aids considerably 
in making decisions concerning the kind of 
training received in the institution and in assess- 
ing levels at which defectives of various ages 
could be expected to function. 


This study was undertaken to develop a
Southbury Scale which would retain the clinical
features of the full Wechsler-Bellevue,
meet the prerequisites for a sound abbreviated 
test, and also give valid predictions of verbal, 
performance, and full scale IQ’s of mental de- 
fectives. 
SUBJECTS 

Two groups of subjects were used in this 
study: (1) an experimental group of 154 in- 
stitutionalized mental defectives in residence at 
The Southbury Training School, and (2) a 
validating group of 50 mental defectives also 
in residence at the training school. Defectives 
over 40 years of age were excluded since ex- 
perience at Southbury has shown that full 
scale Wechsler IQ’s obtained on such individuals
are significantly higher than the level established
by many other previous measures.
Epileptics, psychopaths, neurotics, psychotics, 
cerebral palsies, and etiologies other than fa- 
milial or undifferentiated were also excluded. 
The Department of Psychological Service had 
classified each subject as a “high grade” or “bor- 
derline” defective. Within the institutional 
population, a “high grade” defective is defined 
as one who receives a full scale Wechsler- 
Bellevue IQ between 49 and 67, and who has 
failed to make a satisfactory social adjustment 
prior to admission; a “borderline” defective, 
although he obtains a full scale Wechsler- 
Bellevue IQ between 68 and 79, has demon- 
strated, nevertheless, an inability to meet the de- 
mands of society without strict supervision and 
guidance. “Middle” and “low” grades of deficiency
(IQ’s under 49) were excluded since
the Wechsler-Bellevue test is considered un- 
suitable for these levels of greater deficiency. 
Each subject had been legally judged a mental
defective in accordance with psychological, social,
and developmental criteria, and also in respect
to actual functioning. Their adjustment
to institutional living, however, had been rated
as “reasonably good” or “good.”

TABLE 1
AGE, VERBAL IQ, PERFORMANCE IQ, AND FULL SCALE IQ FOR SEX, ETIOLOGY,
CLASSIFICATION, AND TOTAL VALIDATING GROUP OF 50 DEFECTIVES

                              Age          Verbal IQ        Perf. IQ         Full IQ
                    No.   Mean     SD     Mean     SD     Mean     SD     Mean     SD
Sex
  Male               20   21.42   6.69   64.85    8.07   71.75   10.89   64.80    7.66
  Female             30   24.92   7.50   65.53    7.76   72.43    8.97   66.23    7.63
Etiology
  Familial           37   23.41   7.52   65.62    6.84   72.81    9.88   66.24    6.95
  Undifferentiated   13   23.85   7.12   64.23   10.27   70.31    9.27   64.00    9.25
Classification
  High               27   [illegible]   6.87   60.85    6.37   66.00    7.92   59.93    5.27
  Borderline         23   2 5 [partly illegible]   8.04   70.43    6.21   79.39    6.16   72.39    3.41
Total                50   23.52   7.44   65.26    7.89   72.16    9.79   65.66    7.68


The breakdown in age, verbal IQ, performance
IQ, and full scale IQ, based on ten
Wechsler-Bellevue subtests, for sex, etiology 
and classification of the experimental group of 
154 subjects has been presented fully in a pre- 
vious paper [1]. Briefly, the mean age of the 
experimental group was 23.50 years with a 
range of 14 to 39 years. The full scale Wechs- 
ler-Bellevue IQ’s ranged from 49 to 79 with a 
mean of 64.40; the mean verbal IQ and the 
mean performance IQ for the experimental 
group were 64.19 and 70.98, respectively. The 
cases are proportionate and representative in 
all respects—sex, age, intellectual levels and 
etiologies—of the trainable population in resi- 
dence at The Southbury Training School [5] 
and are probably representative also of similar 
populations in other institutions for the men- 
tally defective. 

Table 1 gives a similar breakdown for the 
validating group of 50 defectives. The numbers
of subjects in each category are in proportion 
to those of the experimental group. The age 
distribution and the distributions of verbal, per- 
formance and full scale IQ’s are almost exactly 
the same here as in the larger experimental 
group. The mean age of the 50 subjects is 
23.52 years with a range from 15 to 39 years. 
The mean verbal IQ, the mean performance 
IQ, and the mean full scale IQ are 65.26, 
72.16, and 65.66, respectively. These measures
are only about one IQ point higher than the
corresponding means in the experimental group.
There are no significant differences between
any of the mean values of the validating group
and the experimental group, and it is considered
that the validating group is also proportionate
and representative in all respects of the
trainable defective population in residence at
The Southbury Training School.


METHODS AND RESULTS 


Experimental Studies. The experimental 
group of 154 institutionalized mental defect- 
ives had been used in the original validation of 
the short scales proposed by other writers. It 
had been suggested by others that these previ- 
ously proposed Wechsler-Bellevue abbreviated 
forms, which had been standardized on various 
clinical groups, were applicable to measurement 
of intellectual levels in the defective range. 
However, the present writers in analyzing the 
data in terms of critical ratios, correlations, 
mean deviations and individual discrepancies 
between the short form scores and the true 
scores had indicated that the scales of Cum- 
mings, Gurvitz, Rabin, and Geil were inapplic- 
able to defectives [1]. One of Geil’s abbrevi- 
ated scales [2] had apparent value for predic- 
tion of full scale IQ’s for defectives, but did 
not show a close relationship to verbal or per- 
formance abilities when these were treated 
separately. 

The data from this experimental group 
were now analyzed for several other two, 
three, and four subtest combinations in the at- 
tempt to find a short scale which gave the best 
discrimination. The independent group of 50 
defectives was used to validate the findings in 
order to substantiate that the Southbury Scale,
in addition to meeting all of the prerequisites 
for an abbreviated scale, produced not only ac- 
curate full scale estimates but also valid esti- 
mates of verbal and performance capacities 
separately. 

Those combinations described in this paper 
were selected on the basis of their proximity to 
the average of the means of the weighted 
scores of the ten subtests and on their clinical 
significance for mental defectives [1]. Em- 
ploying the same procedures as in the valida- 
tion of previous scales, the following combina- 
tions were tested: (1) D-DS, (2) S-BD, (3) 
I-C-BD, (4) I-S-DS, and (5) C-S-PA-BD.
Since a systematic analysis of the relationship
between two psychometric instruments requires
more than a mere determination of the
degree of correlation between them, analysis
was first made of differences in terms of the 
critical ratios between the distributions of the 
predicted and full values for each combination. 
Table 2 shows the mean full scale IQ, the 
predicted means of these short scales, their 
standard deviations, and the critical ratios be- 
tween the distributions. 
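
The paper does not state how the critical ratios were computed. As a rough sketch, the conventional critical ratio for the difference between two uncorrelated means can be applied to the means and standard deviations printed in Table 2, with N = 154 for both distributions. Both the choice of formula and the use of the same N for each distribution are assumptions made here; since the predicted and full scale scores come from the same subjects, a formula for correlated means would give different values.

```python
import math

N = 154                              # experimental group
FULL_MEAN, FULL_SD = 64.40, 7.66     # full scale IQ, from Table 2

# Short form: (predicted mean, standard deviation), from Table 2.
SHORT_FORMS = {
    "D-DS":      (65.29, 12.45),
    "S-BD":      (64.14, 11.27),
    "I-C-BD":    (64.64, 9.90),
    "I-S-DS":    (63.44, 8.50),
    "C-S-PA-BD": (65.47, 9.92),
}


def critical_ratio(mean_a, sd_a, mean_b, sd_b, n):
    """Difference between two means divided by its standard error.

    Assumption: the two distributions are treated as uncorrelated.
    """
    standard_error = math.sqrt(sd_a ** 2 / n + sd_b ** 2 / n)
    return abs(mean_a - mean_b) / standard_error


ratios = {
    name: critical_ratio(mean, sd, FULL_MEAN, FULL_SD, N)
    for name, (mean, sd) in SHORT_FORMS.items()
}

for name, cr in ratios.items():
    print(f"{name:10s} CR = {cr:.2f}")
```

Under this assumed formula every ratio falls well below the conventional value of about 2, which is consistent with the description of the critical ratios as nonsignificant.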


Critical ratios based upon the weighted 
score distributions differ little from the IQ 
distributions. Comparison of these results with 
those found in the validation of previously 
proposed scales revealed that any one of the 
new scales has a distribution as similar to the 
true scores as that of other proposed scales. 
Previously it had been pointed out that a non- 
significant critical ratio along with a high degree 
of correlation between the distributions might 
indicate predicted scores of great value. The 
Pearsonian coefficients of correlation of these 
new scales based upon the weighted scores (in 
order to facilitate direct comparison with results
given by other authors) follow: (1)
.548, (2) .738, (3) .762, (4) .576, and (5)
.862.

The last scale (C-S-PA-BD) showed the
most promise: its predicted mean is only one
IQ point higher than the full mean; it has a 
nonsignificant critical ratio; and finally, it has 
the highest of all the correlations (r = .862, 
PE = .014). This abbreviated form, the 
Southbury Scale, was analyzed more systematically,
and the analysis indicated still further that it met
well the criteria set up for a discriminative 
short test for mental defectives. The analysis 
of the deviations of the predicted from the ac- 
tual scores showed a mean deviation of 4.45 
IQ points, with 58.4 per cent of the cases 
varying from 0 to 4 points, 32.5 per cent of 
the cases varying from 5 to 9 points, and only 
9.1 per cent deviating between 10 and 14 
points. These results as a whole were superior 
to those found in the validation of Geil’s scale 
[1]. 
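
Two of the figures just quoted can be checked against the size of the experimental group. The sketch below assumes the classical probable error formula for a correlation coefficient, PE = .6745 (1 − r²) / √N, which the paper does not spell out, and it converts the reported percentages into case counts, which the paper does not report directly.

```python
import math

N = 154      # experimental group
r = 0.862    # correlation of the C-S-PA-BD scale with the full scale

# Assumption: classical probable error of a correlation coefficient.
probable_error = 0.6745 * (1 - r ** 2) / math.sqrt(N)

# Percentages of cases by size of deviation (IQ points), as given in the text.
band_percentages = {"0-4": 58.4, "5-9": 32.5, "10-14": 9.1}

# Inferred case counts; these are not reported in the paper.
band_counts = {
    band: round(pct / 100 * N) for band, pct in band_percentages.items()
}

print(f"PE = {probable_error:.3f}")
print(band_counts, "total =", sum(band_counts.values()))
```

The three inferred counts add up to the 154 subjects of the experimental group, so the percentages are internally consistent.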

As had been noted above, a short form of
the Wechsler, to be of significance
for mental defectives, must not only be
highly discriminative for full scale IQ predictions,
but must also give an accurate estimate
of the defective’s verbal and performance ca- 
pacities independently. The correlation be- 
tween the four items of the Southbury Scale 
and the full scale weighted scores was found 
to be high. If now the two verbal subtests of 
the abbreviated form correlate highly with 
the full verbal weighted scores, and also if the 
two performance subtests correlate highly with 
the performance weighted scores, the South- 
bury Scale could then be considered to be of 
great clinical and administrative value for the 
prediction of the abilities of mental defectives. 
These Pearsonian correlations were found to 
be: verbal r = .784 and performance r = .839.
In a population whose range is curtailed, correlations
of .784, .839, and .862 for verbal,
performance and full scale determinations
from a single administration of a short scale
can be considered quite good.

TABLE 2
MEANS AND STANDARD DEVIATIONS OF FULL SCALE IQ’S PREDICTED FROM SOUTHBURY SHORT FORMS
AND CRITICAL RATIOS BETWEEN THE PREDICTED AND FULL SCALE IQ’S

                                  Southbury Short Forms
                         (1)      (2)      (3)      (4)       (5)
        Full Scale IQ    D-DS     S-BD    I-C-BD   I-S-DS   C-S-PA-BD
Mean        64.40       65.29    64.14    64.64    63.44     65.47
SD           7.66       12.45    11.27     9.90     8.50      9.92
CR                        .46      .24      .23     1.04      1.03

With the above correlations, regression 
equations for the verbal, performance, and full 
scale weighted score data were obtained [3]. 
These are: 


Verbal         Y = 1.52 X + 4.08
Performance    Y = 1.80 X + 10.30
Full Scale     Y = 1.95 X + 9.02


These regression equations tend to raise the 
predictions below the mean and lower those 
above the mean. New equivalent verbal 
weighted scores based on the sum of the two 
verbal short scale items (C + S), new equiva- 
lent performance weighted scores based on the 
sum of the two performance short scale items 
(PA + BD), and new equivalent full scale 
weighted scores based on the sum of the four 
short scale subtests (C + S + PA + BD)
were computed to the nearest whole number
from the regression equations. These values
are given in Table 3. The IQ equivalents of
these new weighted scores, taking age into account,
can be obtained from the appropriate
Wechsler tables [4].


An example of the use of the Southbury 
weighted scores table may be the following: 
a 16-year-old subject receives weighted scores 
of 6 on Comprehension, 4 on Similarities, 7 
on Picture Arrangement, and 6 on Block 
Design. This subject’s verbal sum is 10, 
which is equivalent to a weighted score of 
19 (Table 3), and this gives him a verbal IQ
of 64. His performance sum is 13, which is 
equivalent to a weighted score of 34 (Table 
3), and a performance IQ of 75. His four 
subtest scores equal 23, which gives him a full 
scale weighted score of 54 (Table 3) and an 
equivalent full scale IQ of 67. Theoretically, 
full scale IQ’s can be computed from the
Southbury data only through an IQ of 79, 
which is the top level of the defective popula- 
tion studied. However, verbal and perform- 
ance IQ’s may be computed above this value 
since a defective with a full scale IQ under 
80 may score above that in either verbal or 
performance abilities. 
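
The worked example above can be retraced with a short sketch. It assumes only the three regression equations printed in the text; the function and variable names are illustrative and not part of the original study. Table 3 and the Wechsler tables [4] remain the authoritative source: the rounded coefficients do not reproduce every tabled entry (a verbal sum of 24, for instance, yields 41 here against 40 in Table 3), and the conversion from weighted score to IQ is not attempted because it depends on the age-specific Wechsler tables.

```python
# Sketch: equivalent weighted scores from the Southbury regression equations.
# Coefficients are those printed in the text; all names are illustrative.
REGRESSION = {
    "verbal": (1.52, 4.08),        # X = C + S
    "performance": (1.80, 10.30),  # X = PA + BD
    "full": (1.95, 9.02),          # X = C + S + PA + BD
}


def equivalent_weighted_score(scale, subtest_sum):
    """Apply Y = bX + a and round to the nearest whole number."""
    slope, intercept = REGRESSION[scale]
    return round(slope * subtest_sum + intercept)


# The 16-year-old subject of the example: C = 6, S = 4, PA = 7, BD = 6.
c, s, pa, bd = 6, 4, 7, 6
verbal = equivalent_weighted_score("verbal", c + s)
performance = equivalent_weighted_score("performance", pa + bd)
full = equivalent_weighted_score("full", c + s + pa + bd)
print(verbal, performance, full)
```

For this subject the sketch gives the same equivalent weighted scores as the example (19, 34, and 54).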


Further analysis revealed that when new
weighted score values were assigned from the
regression equation data, there were no significant
differences between the distributions of
the actual and newly predicted values for the
verbal, performance and full scale measures.


TABLE 3
EQUIVALENT VERBAL, PERFORMANCE, AND FULL SCALE
WEIGHTED SCORES FOR SUBTESTS OF THE
SOUTHBURY SCALE

     Verbal            Performance            Full Scale
 Weighted Scores     Weighted Scores        Weighted Scores
 C+S     Equiv.      PA+BD    Equiv.      C+S+PA+BD   Equiv.
   1        6           1       12             8        25
   2        7           2       14             9        27
   3        9           3       16            10        28
   4       10           4       18            11        30
   5       12           5       19            12        32
   6       13           6       21            13        34
   7       15           7       23            14        36
   8       16           8       25            15        38
   9       18           9       26            16        40
  10       19          10       28            17        42
  11       21          11       30            18        44
  12       22          12       32            19        46
  13       24          13       34            20        48
  14       25          14       35            21        50
  15       27          15       37            22        52
  16       28          16       39            23        54
  17       30          17       41            24        56
  18       31          18       43            25        58
  19       33          19       44            26        60
  20       34          20       46            27        62
  21       36          21       48            28        64
  22       38          22       50            29        66
  23       39          23       52            30        68
  24       40          24       54            31        69

In line with the need for more systematic 
analyses, and in view of the practical consider- 
ation for the use of a short form for mental 
defectives, a study was made of the amount of 
deviation of each subject’s predicted IQ score
from his actual score. This analysis revealed
how closely the predicted values obtained by
means of the Southbury Scale approached the
IQ’s found in administration of ten Wechsler
subtests. The four-item Southbury Scale over- 
predicted and underpredicted true values al- 
most an equal number of times and almost to 
the same degree. The average amounts of deviation
were 3.86, 5.30, and 3.41 IQ points
for the verbal, performance and full scale
determinations, respectively. Analysis of the
cases in which the amount of deviation lay
between 0 and 4, 5 and 9, and 10 or more IQ
points showed that 66.2 per cent of the verbal
predictions, 47.4 per cent of the performance
predictions, and 70.8 per cent of the full scale
predictions are less than 5 IQ points away
from their true values and that 93.5, 89.0, and
98.1 per cent are less than 10 IQ points deviant.
This appears to be as good as one could
expect in a test-retest situation where one is
assessing, through one administration, verbal,
performance, and full scale IQ’s for mental
defectives.


TABLE 4
MEANS AND STANDARD DEVIATIONS OF VERBAL, PERFORMANCE, AND FULL SCALE WEIGHTED
SCORES AND IQ’s PREDICTED FROM SOUTHBURY SCALE AND CRITICAL
RATIOS BETWEEN THE PREDICTED AND FULL VALUES

                        Verbal           Performance         Full Scale
                    Wt. Sc.    IQ      Wt. Sc.    IQ      Wt. Sc.    IQ
Predicted Values
    Mean             18.46   65.14      28.40   71.92      47.00   65.42
    SD                4.28    6.67       6.84    8.82      11.97    8.46
Full Values
    Mean             18.60   65.26      28.52   72.16      47.24   65.66
    SD                6.50    7.89       8.20    9.79      12.05    7.68
Critical Ratio         .12     .08        .13     .13        .10     .15


VALIDATING STUDIES 


In order to validate the Southbury Scale, an 
entirely independent group of 50 mentally de- 
ficient subjects was administered the ten sub- 
tests (vocabulary excluded) of the Wechsler- 
Bellevue Scale under the usual standardized 
conditions. As noted above, this validating 
group of 50 representative defectives is equated 
in all respects with the original experimental
group (Table 1).


It was first necessary to determine whether 
these equated groups were similar in their abil- 
ities on each of the four items of the proposed 
scale. The weighted score means and standard 
deviations of the four Wechsler subtests under 
consideration and the critical ratios between 
the weighted score distributions of the two 
groups showed that there are no significant dif- 
ferences. The means of each subtest are less 
than one-half point apart and the critical ra- 
tios range from .32 to .93. 

The same procedures and criteria employed 
in the validation of previously proposed scales 
with mental defectives were utilized in the 
present validation of the Southbury Scale. 


Table 4 compares the results of the verbal, 
performance and full scale values predicted 
from the separate regression equations with 
the actual results of the administration of the 
full Wechsler-Bellevue; the table also gives the
critical ratios between the distributions. 

In this validating population there is no sig- 
nificant difference between the predicted 
weighted scores and the actual weighted 
scores for each of the three determinations. 
The same is true of the distributions of the 
verbal, performance, and full scale IQ’s. The
first test of the validity of the Southbury Scale 
was adequately met. 


A second validating criterion, the correla- 
tion between the various distributions, was 
next evaluated. Pearsonian coefficients of cor- 
relation were run between the new weighted 
scores predicted from the two verbal items and 
the values of the five Wechsler verbal sub- 
tests, between the predicted equivalent values 
from the two performance items and the five 
Wechsler performance items, and also between 
the newly predicted weighted values of the 
four subtests and the full Wechsler scale. 
These correlations were: verbal r = .875, PE 
= .022; performance r= .914, PE = .016; 
and full scale r = .936, PE = .012. These 
correlations are somewhat higher than, but in 
the same direction as, the correlations obtained 
in the experimental population. Coefficients 
of correlation obtained between predicted IQ’s 
and actual IQ’s differed little from these 
weighted score values. The nonsignificant 
critical ratios along with the high degree of cor- 
relation between the various distributions indi- 
cate predicted scores of great value. 
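
The probable errors quoted with these correlations are not derived in the text. A minimal sketch, assuming the classical formula for the probable error of a Pearson coefficient, PE = .6745(1 - r²)/√N, with N = 50 for the validating group, comes out at the reported values; the function name is illustrative.

```python
import math


def probable_error_of_r(r, n):
    """Classical probable error of a Pearson r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Correlations reported for the validating group of 50 subjects.
for label, r in [("verbal", 0.875), ("performance", 0.914), ("full scale", 0.936)]:
    print(label, round(probable_error_of_r(r, 50), 3))
```

Rounded to three places, the sketch agrees with the PE values of .022, .016, and .012 given above, which suggests (but does not prove) that this formula was the one used.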
The final criterion for the validity of the
Southbury Scale, the practical analysis of the
amount of deviation between predicted IQ’s
and true IQ’s, was made in terms of the mean
individual deviations and the percentage of
cases that differed between 0 and 4 IQ points,
between 5 and 9 IQ points, and more than 10 
IQ points above or below the original verbal, 
performance, and full scale IQ’s. The mean 
deviations in IQ points were 2.72, 4.24, and 
2.80 for verbal, performance and full scale 
values, respectively. These average amounts 
of deviation are again less in the validating 
group than they were in the experimental 
group. In addition, the deviation for the full 
scale measure is less here than it was for any 
of the previously proposed scales. Table 5 
shows the percentage of cases which deviated 
between 0 and 4, 5 and 9, and 10 or more IQ 
points for each of the three measures. 


TABLE 5
MEAN IQ DEVIATION AND PERCENTAGE OF CASES
DEVIATING 0-4, 5-9, AND MORE THAN 10 IQ
POINTS ABOVE OR BELOW THE ACTUAL IQ
(VALIDATING GROUP)

                    Mean            Percentage Deviating
Southbury         Deviation                       More than
Scale            (IQ Points)      0-4      5-9      10 IQ
Verbal              2.72          84%      14%        2%
Performance         4.24          54%      42%        4%
Full                2.80          82%      18%        0%





As can be seen, 98 per cent of the verbal pre- 
dictions, 96 per cent of the performance pre- 
dictions, and 100 per cent of the full scale pre- 
dictions, deviated less than 10 IQ points from 
the full scores. As a matter of fact, most of the 
cases deviate by 4 or less IQ points. These re- 
sults with the validating group are apparently 
better than those of other scales and indicate 
that the Southbury Scale can be used with as- 
surance that the majority of cases will not de- 
viate beyond a test-retest chance differential. 
In any instance where there may be a deviation 
of more than 9 points from previously estab- 
lished levels, the remaining six Wechsler sub- 
tests should be administered in order to deter- 
mine whether any change in the intellectual 
status of the subject has actually taken place. 
No short scale should be used in an initial di- 
agnosis of mental deficiency. 
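
The deviation analysis used as the final criterion can be sketched as follows. This is an illustration only: the IQ pairs are invented and are not data from the study, the function name is hypothetical, and the bands follow the 0-4, 5-9, and 10-or-more grouping used in the text.

```python
def deviation_summary(predicted_iqs, actual_iqs):
    """Return the mean absolute deviation and the percentage of cases per band."""
    deviations = [abs(p - a) for p, a in zip(predicted_iqs, actual_iqs)]
    n = len(deviations)
    counts = {"0-4": 0, "5-9": 0, "10 or more": 0}
    for d in deviations:
        if d < 5:
            counts["0-4"] += 1
        elif d < 10:
            counts["5-9"] += 1
        else:
            counts["10 or more"] += 1
    percentages = {band: 100.0 * k / n for band, k in counts.items()}
    return sum(deviations) / n, percentages


# Invented IQ pairs, for illustration only (not data from the study).
predicted = [64, 70, 58, 75, 66]
actual = [66, 63, 58, 71, 77]
mean_deviation, bands = deviation_summary(predicted, actual)
print(mean_deviation, bands)
```

Applied to a subject's previously established level, the same computation flags the cases deviating by more than 9 points, for which the text recommends giving the remaining six subtests.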

In the study of the validity of previously 
proposed short forms of the Wechsler, it had
been found that one of Geil’s scales had value 
for prediction of full scale IQ’s of mental de- 
fectives, but was not applicable for separate 
verbal or performance determinations. Since 
it was possible that this scale might show bet- 
ter results with the present group of 50 de- 
fectives than it had with the experimental 
group, a recheck was made, utilizing all of the 
validity criteria. The results showed that, al- 
though the three verbal items of Geil’s scale 
are more closely related to the full verbal 
scores, the predictions for performance levels 
and full scale determinations are not as good; 
for example, a large percentage of the predicted 
performance IQ’s fall beyond 10 points from 
the true IQ’s. In general, it is considered that 
for prediction of verbal, performance, and full 
scale measures of the intellectual capacities of 
mental defectives, the Southbury Scale as a 
whole is definitely superior to the Geil scale 
when both are validated on the same population.


DISCUSSION AND SUMMARY


A previous study of the validity of abbrevi- 
ated Wechsler-Bellevue scales for mental de- 
fectives had shown that there still was a need 
for a short form which would not only fulfill 
the criteria for a sound, brief test, but would 
also give valid predictions of verbal, perform- 
ance and full scale IQ’s of defectives. Previ- 
ously proposed scales, although valid for other 
clinical groups, had been shown to be inap- 
plicable to an accurate measurement of the in- 
tellectual capacities of mental defectives. The 
present study introduces the Southbury Scale, 
which is considered to meet most adequately 
all the prerequisites for a valid short test for 
defectives. 

1. The subjects were 204 high grade or 
borderline mental defectives of familial or un- 
differentiated etiology, 154 in the experimental 
group and 50 in the validating group. Each 
group was representative of the trainable in- 
stitutional population at The Southbury Train- 
ing School. 

2. The full Wechsler-Bellevue (vocabulary 
excluded) was administered to each subject. 
Various combinations of subtests were ana- 
lyzed statistically for similarity of distribu- 
tions, high correlations, small mean deviations, 
and acceptable individual discrepancies. The 
Southbury Scale, consisting of Comprehension,
Similarities, Picture Arrangement, and
Block Design, was found to be the most dis- 
criminative form for verbal, performance, and 
full scale determinations. 

3. New weighted scores were computed 
through regression equations. On the basis of 
these new scores the percentage of individual 
cases deviating within acceptable test-retest 
limits for defectives was very high. 

4. A validation study on the group of 50 
subjects revealed no significant distribution 
differences, high correlations, small mean de- 
viations, and negligible discrepancies between 
the verbal, performance, and full scale deter- 
minations on the Southbury Scale and the same 
measurements on the complete Wechsler- 
Bellevue. 

5. The Southbury Scale with its table of 
weighted scores is presented as a valid and clin- 
ically diagnostic test of the intellectual capaci- 
ties of mental defectives. Through a single ad- 
ministration of its four subtests, the scale pro- 
vides reliable and close estimates not only of 
full scale IQ’s but also of verbal and perform- 
ance determinations separately. Although the 
Southbury Scale lacks the clinical advantages 
of the full Wechsler examination, it still re- 
tains items of diagnostic significance. The scale 
takes approximately 25 minutes to administer 
and provides ample opportunity for observation
of the subject’s behavior and emotional reactions.
The time saved by the use of this abbreviated
scale can be utilized by the clinician
in exploring personality dynamics or special 
abilities which may be of greater importance 
than the general intellectual level. Its value is 
high in overcrowded and understaffed institu- 
tions and clinics. 

6. The Southbury Scale is a highly service- 
able measure of the intellectual capacities of 
mental defectives. Its success with defectives 
should not be assumed for other clinical groups 
without further investigation. 


Received December 5, 1949. 


A STUDY OF THE TWO FORMS OF THE
WECHSLER-BELLEVUE INTELLIGENCE SCALE


RENATE GERBOTH 


ALTON STATE HOSPITAL, ALTON, ILLINOIS 


The relationship between the Wechsler-Bellevue
Intelligence Scale Form I and
Form II has not been clearly established,
and no comparative study between
these two forms has been published. It is apparent
that most of the research done with the
Wechsler-Bellevue has been in the clinical
field, establishing scatter patterns of schizophrenics,
intelligence of narcotic drug addicts,
and performance of problem children. In these
studies, Form I of the test has been used a
major portion of the time. A comparative
study was conducted in order to discover the
amount of practice effect obtained through retesting,
the difference between the mean scores
of the three scales and eleven subtests of the
two forms, and finally to determine their intercorrelation,
and their correlation with the
American Council on Education Psychological
Examination scores, and with college grades.¹


PROCEDURE 


The subjects for this experiment were 100
Washington University students, fifty males
and fifty females. They were evenly distributed
between the ages of seventeen and
twenty-four, and thus fell into two of Wechsler’s
intelligence quotient categories: 17-19
and 20-24 years of age. The median age of
these students was twenty.

The students’ university classifications 
ranged from freshmen to graduate students, 
their median falling in the sophomore class. 
About two thirds of the students were en- 
rolled in beginning psychology classes. 

During the first semester each subject was
given a Wechsler-Bellevue test by the examiner.
All eleven subtests were administered
to each student. Approximately four months
later each subject was retested, using the other
form of the Wechsler-Bellevue. Twenty-five
men and twenty-five women were given Form
I first and Form II upon retesting; and the
others were given Form II first and then Form
I. All tests were administered under the same
conditions in the psychological laboratory at
Washington University.

¹ This experiment was conducted under the direction
of Associate Professor Winifred K. Magdsick,
Washington University, in partial fulfillment of the
requirements for the degree of Master of Arts.

The scores on these tests were correlated 
with each student’s grade point average at the 
end of the first semester of 1947-1948, and 
with their American Council on Education 
Test scores, which seventy of the one hundred 
students had taken upon entering Washington 
University. Since the students entered college 
at various times, no one edition of the ACE 
had been used, and since not all of the raw 
scores were available, percentile norms had to
be used.


RESULTS AND DISCUSSION 


Practice Effects. The means were computed 
for the three scales, Verbal, Performance, and 
Full Scales, obtained at the first administra- 
tion, for both Form I and Form II. These 
means were compared with the means computed 
from the scores of the three scales obtained at 
the second administration. The differences be- 
tween means indicated the amount of practice 
effect obtained through retesting, using alternate
forms. The critical ratios and their significances
were calculated. These results are
shown in Table 1.

As can be seen from Table 1, when Form I
was given first the practice effect was always
so negligible that it was not significant. However,
when Form II was given first, the practice
effect became highly significant. The Performance
Scale, especially, showed an increase
of almost seven points upon retesting with
Form I. Therefore, when Form II is given
first and Form I is used for retesting, it is suggested
that about five points should be subtracted
from the retest scores, but no changes
of scores are indicated when Form II is used
for retesting. Significant differences of this
magnitude are apparently due to practice effect
and not to the increase in chronological
ages of the subjects between the first and second
administration. This finding agrees with
that of Lowell [3], who eliminated chronological
age differences as a cause of variation
of the IQ, when retesting 3,000 children two,
three, and four times.


TABLE 1
DIFFERENCES BETWEEN FIRST AND SECOND ADMINISTRATION OF FORMS I AND II

               First Administration     Second Administration
               Form   Mean and SD       Form   Mean and SD        Dif.     CR    Signif.
Verbal         I*     124.42 (6.16)     II     122.36 (7.32)     -2.06    1.51   (low)
               II     119.36 (8.11)     I      123.38 (8.18)      4.02    2.44   (.01)
Performance    I      122.12 (7.51)     II     123.24 (6.32)      1.12    0.80   (low)
               II     117.30 (7.54)     I      124.22 (6.95)      6.92    4.71   (.01)
Full Scale     I      125.60 (7.61)     II     125.31 (6.60)     -0.29    0.22   (low)
               II     120.88 (6.58)     I      126.17 (5.72)      5.29    4.26   (.01)

*I refers to the Wechsler-Bellevue Scale, Form I, and II refers to Form II.


When the scores of the men and women
were tabulated separately, sex difference was
noticeable in practice effect (Table 2). The
men showed a greater amount of practice effect
than the women whenever Form I was
given first. This, however, may be due to sampling.

TABLE 2
DIFFERENCES BETWEEN FIRST AND SECOND ADMINISTRATION OF FORMS I AND II,

               First Administration     Second Administration
               Form   Mean and SD       Form   Mean and SD        Dif.     CR    Signif.

Men (N = 25)
Verbal         I      125.64 (5.92)     II     123.40 (8.72)     -2.24    1.06   (low)
               II     120.88 (7.14)     I      124.64 (7.81)      3.76    1.78   (.10)
Performance    I      121.04 (9.54)     II     122.84 (6.56)      1.80    0.77   (low)
               II     117.16 (6.94)     I      123.40 (6.17)      6.24    3.35   (.01)
Full Scale     I      125.68 (9.64)     II     125.26 (8.16)     -0.42    0.16   (low)
               II     121.48 (6.14)     I      126.56 (5.25)      5.08    3.16   (.01)

Women (N = 25)
Verbal         I      123.20 (6.41)     II     121.32 (5.93)     -1.88    1.07   (low)
               II     117.84 (9.09)     I      122.12 (8.55)      4.28    1.71   (.10)
Performance    I      123.20 (5.48)     II     123.64 (6.09)      0.44    0.27   (low)
               II     117.44 (8.15)     I      125.04 (7.74)      7.60    3.38   (.01)
Full Scale     I      125.52 (5.59)     II     125.36 (5.05)     -0.16    0.11   (low)
               II     120.28 (7.03)     I      125.78 (6.19)      5.50    2.93   (.01)


The men also had a higher Full Scale mean
than the women on Forms I and II of the first
administration, and on Form I of the second
administration. On the Verbal Scale, the difference
was even more pronounced in favor of
the men, but on the Performance Scale women
did about one point better than men. This discrepancy
is at variance with other studies of
intelligence differences between men and women.
Men usually do better on practical and
arithmetic problems, and women do better on
language and verbal tests [8], contrary to our
findings. Other psychologists, for example Terman
[7] and Lowell [3], state that sex has no
bearing upon IQ differences. Wechsler believes
that the differences cancel each other out
in the Full Scale. They do not do so fully,
however, since Wechsler [8] reports that the
intelligence of women is higher at almost every
age level. Our study does not corroborate the
superior intelligence of women, but this may
perhaps be explained on the basis of sampling.


TABLE 3
DIFFERENCES BETWEEN THE SUBTESTS OF FORMS I AND II

                          First Administration    Second Administration
                          Form   Mean and SD      Form   Mean and SD       Dif.     CR    Signif.
Information               I      13.64 (1.67)     II     13.32 (1.68)     -0.32    0.95   (low)
                          II     12.76 (1.74)     I      13.48 (1.53)      0.72    2.16   (.05)
Comprehension             I      13.88 (1.97)     II     12.08 (1.13)     -1.80    5.77   (.01)
                          II     12.56 (1.47)     I      13.38 (1.26)      0.82    2.96   (.01)
Arithmetical Reasoning    I      13.08 (3.10)     II     12.10 (3.17)     -0.98    1.58   (low)
                          II     10.66 (2.61)     I      13.28 (3.08)      2.62    4.60   (.01)
Digit Span                I      11.30 (3.17)     II     11.66 (2.98)      0.36    0.58   (low)
                          II     11.44 (3.36)     I      11.58 (3.12)      0.14    0.21   (low)
Similarities              I      15.04 (1.14)     II     13.28 (1.57)     -1.76    6.28   (.01)
                          II     12.94 (1.93)     I      14.32 (1.77)      1.38    3.54   (.01)
Vocabulary                I      13.54 (1.21)     II     13.78 (1.96)      0.24    0.73   (low)
                          II     13.44 (1.53)     I      13.26 (1.15)     -0.18    0.69   (low)
Picture Arrangement       I      11.08 (2.33)     II     13.02 (2.22)      1.94    4.24   (.01)
                          II     11.84 (2.80)     I      12.24 (2.68)      0.40    0.73   (low)
Picture Completion        I      13.44 (1.31)     II     12.28 (1.18)     -1.16    4.46   (.01)
                          II     11.60 (1.97)     I      13.76 (1.84)      2.16    5.84   (.01)
Block Design              I      14.84 (2.13)     II     14.72 (2.28)     -0.12    0.28   (low)
                          II     13.94 (1.98)     I      15.12 (2.07)      1.18    2.95   (.01)
Object Assembly           I      13.42 (1.95)     II     13.44 (2.09)      0.02    0.05   (low)
                          II     12.98 (1.84)     I      13.46 (1.62)      0.48    1.37   (low)
Digit Symbol              I      13.84 (1.38)     II     14.24 (1.67)      0.40    1.25   (low)
                          II     13.74 (1.05)     I      13.84 (1.28)      0.10    0.45   (low)


The difference between the first and second 
administration of the eleven subtests, desig- 
nating practice effect, is reported in Table 3. 
Practice effect can be found only in five of
the eleven subtests when Form I is administered
first, and only one of these is significant
(Picture Arrangement). When Form II is
given first, ten of the subtests show positive 
practice effect, six of which are significant, and 
only Vocabulary shows none. The subtest 
showing the highest amount of practice effect 
when Form I is followed by Form II, is Pic- 
ture Arrangement. When Form II is admin- 
istered first, Arithmetical Reasoning and Pic- 
ture Completion rank highest. Since the amount 
of practice effect varies on the different forms, 
it is most probably due to the ease or difficulty 
of the various subtests, rather than to the gen- 
eral familiarity of the material to the subjects. 


Scatter Pattern. A noticeable amount of
variability between the means of the subtests
on each form of the test is also shown in Table
3. The scatter pattern of these means indicates
a range from 11.08 to 15.04 on the first
administration of Form I, and from 10.66 to
13.94 on the first administration of Form II.
On the second administration they ranged from
11.58 to 15.12 on Form I, and from 11.66 to
14.72 on Form II. Since the means of 50 cases
were used, individual differences were canceled
out, the means being more constant than
the individual scores. These results disagree
somewhat with Rapaport’s statement that a
well-adjusted person should show little discrepancy
among the eleven subtests, and that
any scatter may be evidence of impairment [6].
The subjects of this experiment were apparently
well-adjusted college students. Wechsler
[8] disagrees with Rapaport in stating that
among superior adults, verbal tests are often
higher than performance tests. Wechsler also
gives a table [8, p. 222] showing the means of
subtests obtained from normal subjects which
display a pattern similar to ours. Estes [2] also
found considerable scatter when testing 102
college students with the Wechsler-Bellevue.
He believed that he had obtained a picture of
normal scatter, in which most of the means of
the performance tests were significantly lower
than the means of the verbal tests. His results
support our conclusions.


TABLE 4
DIFFERENCES BETWEEN FORMS I AND II

                        Form I            Form II
                      Mean and SD       Mean and SD        Dif.     CR    Signif.
Verbal        1st*    124.42 (6.16)     119.36 (8.11)     -5.06    3.46   (.01)
              2nd     123.38 (8.18)     122.36 (7.32)     -1.02    0.65   (low)
Performance   1st     122.12 (7.51)     117.30 (7.54)     -4.82    3.17   (.01)
              2nd     124.22 (6.95)     123.24 (6.32)     -0.98    0.73   (low)
Full Scale    1st     125.60 (7.61)     120.88 (6.58)     -4.72    3.28   (.01)
              2nd     126.17 (5.72)     125.31 (6.60)     -0.86    0.70   (low)

*1st indicates first administration, 2nd is second administration.

Differences Between the Scores of Form I 
and Form II. To ascertain the significance of 
the difference between the results of Form I 
and Form II, the means of the different scales 
of Form I and Form II were computed sep- 
arately, according to whether they were ob- 
tained from the first or second administration 
of the test. 
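
The text does not state the formula behind its critical ratios. A minimal sketch, assuming the usual critical ratio for two independent means (the difference divided by the standard error of the difference) and 50 subjects per group, is given here for the Verbal Scale on the first administration; the function name is illustrative.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Absolute difference of two means over the standard error of the difference."""
    se_diff = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return abs(mean_1 - mean_2) / se_diff


# Verbal Scale, first administration: Form I versus Form II (50 subjects each).
cr_verbal_first = critical_ratio(124.42, 6.16, 50, 119.36, 8.11, 50)
print(round(cr_verbal_first, 2))
```

Under these assumptions the ratio comes to about 3.5, close to the 3.46 reported for this comparison; the small gap indicates that the original computation differed in some detail that the text does not record.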

This analysis is shown in Table 4. Here one
can see that on the first administration, Form
I is about five points higher than Form II on
all three scales. However, on the second administration,
the differences between the means
on Forms I and II are never significant.
Wechsler [9] found a mean difference of less
than two points between the Full Scales of the
two forms. His suggested adjustments, to be
added to Form II for equating the two forms,
do not seem sufficient in the light of the present
study. A similar pattern appears when considering
men and women separately.


TABLE 5
DIFFERENCES BETWEEN THE SUBTESTS OF FORMS I AND II

                            Form I          Form II
                          Mean and SD     Mean and SD       Dif.     CR    Signif.
Information               13.64 (1.67)    12.76 (1.74)     -1.08    3.08   (.01)
Comprehension             13.88 (1.97)    12.56 (1.47)     -1.32    3.77   (.01)
Arithmetical Reasoning    13.08 (3.10)    10.66 (2.61)     -2.42    4.24   (.01)
Digit Span                11.30 (3.17)    11.44 (3.36)      0.14    0.22   (low)
Similarities              15.04 (1.14)    12.94 (1.93)     -2.10    6.37   (.01)
Vocabulary                13.54 (1.21)    13.44 (1.53)     -0.10    0.56   (low)
Picture Arrangement       11.08 (2.33)    11.84 (2.80)      0.76    1.38   (low)
Picture Completion        13.44 (1.31)    11.60 (1.97)     -1.84    5.26   (.01)
Block Design              14.84 (2.13)    13.94 (1.98)     -0.90    2.19   (.05)
Object Assembly           13.42 (1.95)    12.98 (1.84)     -0.44    1.13   (low)
Digit Symbol              13.84 (1.38)    13.74 (1.05)     -0.10    0.42   (low)

A consideration of the eleven subtests presents
a very similar picture. Table 5 summarizes
these data. Five subtests show no
significant difference between Form I and
Form II; the remaining six yield a
significantly higher score on Form I. Digit
Span and Vocabulary show the least difference
between the two forms, which is the ideal
relationship between two forms of a test.

Again our results disagree somewhat with 
Wechsler’s. He states that the Comprehension 
subtest is the only subtest on Form II that is 
significantly more difficult than that on Form 
I; and that the Object Assembly subtest is 
easier on Form II [9]; in our study, however,
this latter subtest has a lower mean on
Form II than on Form I.


Reliability. One way of estimating the “reli- 
ability” of a test is to obtain the correlation 
between two equivalent forms. Therefore the
correlations between the first administration of
Form I and the second administration of
Form II, using 50 people, and vice versa, were
computed for the Verbal, Performance,
and Full Scales.


The computations reveal that the correlation
between the Verbal Scales of Forms I and II
is .86 when Form I is given first and .78 when
Form II is given first. The Performance Scales
correlate .71 when Form I is given first, but
only .52 when Form II is given first. The Full
Scales correlate .79 when Form I is given first
and .73 when Form II is given first.
From these results it can
be seen that the correlation between Forms I 
and II is consistently higher when Form I is 
administered first. It would seem better to 
give Form I first and Form II upon retesting. 
These results are probably related to the fact 
that the practice effect obtained by retesting is 
virtually cancelled out by the greater difficulty 
of Form II, if administered in the suggested 
order. 


It is interesting to note that the correlation 
between the two forms of the Wechsler- 
Bellevue greatly falls short of the correlation 
obtained between the two forms of the Stanford-Binet,
which varies between .90 and .98,
according to the different IQ levels [7]. Also 
Form L and Form M yield the same scores 
upon retesting, when adjusted for practice ef- 
fect, as suggested by Terman [7]. This, too, 
cannot be said of the two forms of the Wechs- 
ler-Bellevue, even from Wechsler’s own data. 


The Verbal Scales of the Wechsler, Forms 
I and II, have a much higher correlation than 
the Performance Scales; the Full Scale reli- 
ability coefficient gives a padded mean of the 
two other scales. Rabin found a correlation of 
.84 upon retesting sixty mental patients with 
the Wechsler-Bellevue, Form I, after 13 
months [4]. Rabin also retested thirty schizo- 
phrenics and thirty nonschizophrenics after a 
one year interval; he then obtained an r = .55
for the schizophrenic patients and an r = .89
for the nonschizophrenics. He found, as was 
also found in the present study, that the Verbal 
Scale shows a much greater reliability than the 
Performance Scale [5]. 


Correlation of Grades with the Full Scale. 
To determine the predictive value of the 
Wechsler-Bellevue, the Full Scale was correlated
with the grade point average the subjects
had earned at the end of the first semester
of 1947-1948. The 100 subjects were
again divided into two groups of fifty each,
according to the order in which Form I or
Form II was taken. The correlation of grade
points with the test scores of Form I was
0.29 ± 0.09, which was significant at the 0.05
level. The correlation of grade point averages
with Form II was 0.26 ± 0.09, which is almost
significant at the 0.05 level of confidence.
From these results it seems apparent that 
Form I has a slightly greater predictive value 
than Form II, although the difference is only 
.03 points. 


Correlation of the ACE with the Verbal 
and Full Scales. When computing this corre- 
lation, percentile norms had to be used for the 
American Council on Education tests, as previously
explained. The results were that the
correlation between the Verbal Scale of Form
I and the ACE scores was .57 ± .04. With
the Full Scale the ACE scores have a correlation
of .58 ± .04. The correlation between
the Verbal Scale of Form II and the ACE
scores was only .33 ± .01. Using the Full
Scale of Form II and the ACE scores, the correlation
was .34 ± .01. The ACE scores and
the grade point averages have a correlation of
.41 ± .01.


SUMMARY AND CONCLUSION 


This comparative study was conducted to 
discover the amount of practice effect obtained 
through retesting, the difference between the 
mean scores of the three scales and eleven sub- 
tests of the two forms, and finally to determine 
their correlation with each other, with the 
ACE scores, and with college grades. 

Our experiment tentatively answers several 
questions, which are summarized below: 

1. The practice effect from the first to the 
second administration, Form I given first, is 
never significant. However, when Form II is 
given first, the practice effect is highly significant.
By subtracting about five points of IQ
from the retest scores, the two results may be
equalized. 

2. The practice effect in the subtests, when 
Form I is given first, is significant only in the 
Picture Arrangement subtest. But when Form 
II is given first, 10 of the 11 subtests show 
practice effect, Arithmetical Reasoning and 
Picture Completion ranking highest. 

3. The sex difference was significant. Con- 
trary to the results of other studies, the men 
made a higher score on the Verbal Scale, and 
the women a higher score on the Performance 
Scale. On the Full Scales the difference is in- 
significant. 

4. The scatter pattern found in this study 
is similar to that found by Wechsler and 
Estes, but is contradictory to Rapaport’s 
statements. 

5. There exists a significant difference on the 
first administration between Form I and Form 
II of about five points on each of the three 
scales, Form I always being the higher one. 
When considering men and women separately, 
a similar difference is found. 

6. Subtests show significant differences: six 
subtests yield a significantly higher mean on
Form I, first administration; the remaining
five show no significant difference between the 
two forms. 

7. The correlation between the two forms 
is higher when Form I is given first (.79) 
than when Form II is given first (.73). There- 
fore, Form I should be used first, and Form 
II for retesting. The Verbal Scales (.86, .78) 
correlate much more highly than the Perform- 
ance Scales (.71, .52). 

8. Grade point averages correlate very 
slightly more highly with the Full Scale of 
Form I (.29) than with that of Form II (.26). 

9. The correlation of the ACE percentiles 
with the Verbal and Full Scales yields a much 
higher r with Form I (.57, .58) than with 
Form II (.33, .34). 


Received November 9, 1949. 


REFERENCES 


1. Anderson, E. E., et al. A comparison of the
Wechsler-Bellevue, Revised Stanford-Binet, and
ACE tests at the college level. J. Psychol.,
1942, 14, 317-326.

2. Estes, S. G. Deviations of Wechsler-Bellevue
subtest scores from vocabulary level in superior
adults. J. abnorm. soc. Psychol., 1946, 41, 226-228.

3. Lowell, Frances E. A study of the variability
of IQ’s in retests. J. appl. Psychol., 1941,
25, 341-356.

4. Rabin, A. I. Test constancy and variability in
the mentally ill. J. gen. Psychol., 1944, 31,
231-239.

5. Rabin, A. I. The use of the Wechsler-Bellevue
scales with normal and abnormal persons. Psychol.
Bull., 1945, 42, 410-422.

6. Rapaport, D., et al. Manual of diagnostic psychological
testing, I. Diagnostic testing of intelligence
and concept formation. New York:
Josiah Macy, Jr. Foundation, 1944.

7. Terman, L. M., and Merrill, Maude A. Measuring
intelligence. Boston: Houghton Mifflin,
1937.

8. Wechsler, D. The measurement of adult intelligence.
(3rd Ed.) Baltimore: Williams &
Wilkins, 1944.

9. Wechsler, D. The Wechsler-Bellevue intelligence
scale, Form II. New York: Psychological
Corp., 1946.





WECHSLER MEMORY SCALE PERFORMANCE OF 
PSYCHONEUROTIC, ORGANIC, AND 
SCHIZOPHRENIC GROUPS¹


JACOB COHEN


BRONX VETERANS ADMINISTRATION HOSPITAL² AND NEW YORK UNIVERSITY³


WITH the development in recent years
of pattern analysis of intelligence 
scales, clinical psychology has made 
further strides toward the objectification of di- 
agnostic psychological testing. Wechsler [8] 
and Rapaport [7] have indicated that the 
scores obtained by a subject on subtests designed 
to measure cognitive functions (such as the 
Wechsler-Bellevue Intelligence Scale) can, 
when studied in their interrelationships, give 
insights in personality evaluation and clinical 
psychiatric diagnosis. 

Since neuropsychiatric patients who fall in a 
given diagnostic entity tend to have somewhat 
similar patterns in intelligence test perform- 
ance, which patterns differ among various di- 
agnostic entities, the hypothesis suggests itself 
that such patterns exist in other spheres of cog- 
nitive behavior. One such sphere is memory 
functioning. Different diagnostic groups may 
yield characteristic patterns of memory func- 
tioning when investigated by means of the sub- 
tests provided by the Wechsler Memory Scale 
[9]. Such patterns, as defined by differences 
among diagnostic groups on various scores 
yielded by the Wechsler Memory Scale, would 
be of both practical diagnostic and theoretical 
utility. 

With only a few exceptions, studies on 
memory in the literature are confined to normals
working under various experimental
conditions, and, more recently, to the brain- 
damaged (organics). Since these later studies 
compare memory functions of organics with 
normals, whatever their theoretical importance, 
they are of sharply limited diagnostic utility. 
The initial problem, faced daily by the clin- 
ician engaged in psychodiagnosis, is to decide 
whether a patient is psychoneurotic, psychotic, 
or organic. Thus, the purpose of this investiga- 
tion was to assess the nature and significance 
of differences among representative clinical 
groups in various measures from the Wechs- 
ler Memory Scale, with a view towards es- 
tablishing patterns of memory functioning use- 
ful in differential diagnosis. 


RELATED STUDIES 


There is little to be found in the literature 
bearing directly on the problem under consid- 
eration. However, some work has been re- 
ported on the effect of brain damage on mem- 
ory. In the most recent review of the literature, 
Klebanoff [5] comments on the paucity of 
systematic quantitative research in this area 
and concludes, “. . .the results have demon- 
strated nothing that may be considered even a 
consistent trend, findings being inconsistent 
and contradictory” [5, p. 608]. 


Armitage [2] tested organics, neurotics, and 
normals on the immediate recall of a short 
story, and found the organics significantly 
poorer in this function than the other two 
groups, which did not differ from each other. 
Graham and Kendall [3] also succeeded in ob- 
taining highly significant differentiation be- 
tween brain-damaged patients and an equated 
mixed group of functional psychotics, psycho- 
neurotics, and general medical patients on a 
fifteen-item test for immediate recall of simple 
geometric designs.

On the other hand, when Aita, Armitage,
Reitan, and Rabinovitz [1] compared a brain- 
injured group with a mixed neuropsychiatric 
control group on a group of tests, some of 
which were memory tests, they concluded that 
neither the Wechsler-Bellevue Digit Span 
subtest, nor the immediate recall subtests of 
the Hunt-Minnesota Test for Organic Brain 
Damage [4] are of value in differentiating be- 
tween brain-injured and mixed neuropsychi- 
atric groups. Thus, the results of these repre- 
sentative studies seem to indicate that some 
memory functions are impaired by brain 
damage while others are not. However, such 
a conclusion remains tentative, since the studies 
cited are not directly comparable with each 
other in terms of the constitution of either the 
experimental or control groups. 

Finally, Zangwill [10], on the basis of some 
preliminary work attempting to distinguish 
between memory defects in neurotics and or- 
ganics, concludes that tests of rote learning, 
such as the paired associates test, “. . . appear
to be the only type which give satisfactory results,
. . . (since) . . . the learning of simple
test material is virtually impossible in cases
with gross organic memory disturbance” [10,
p. 579].

An analysis of the data of the present study 
makes possible a partial test of some of the 
above hypotheses and findings. 


SUBJECTS 


The research population consisted of 144 
white male neuropsychiatric patients, all 
World War II veterans between the ages of 
20 and 40. They were all examined by the 
clinical psychology staff at Bronx VA Hospital 
between January 1946 and December 1947, 
and final psychiatric diagnoses were available. 
Patients with “mixed” diagnoses, i.e., organ- 
ics with neuroses, and doubtful cases were 
not used. Each patient had had a complete 
Wechsler-Bellevue Intelligence Scale and a 
Wechsler Memory Scale during the regular 
psychometric evaluation, and the research data 
were obtained from these forms. The experi- 
mental groups were constituted as follows: 


Psychoneurotics. This group included patients
with functional disorders who are not psychotic.
Patients with diagnoses of “psychopathic personality,”
“character disorder,” etc., and psychosomatic
cases were excluded. There were 81 patients in
this group, of whom 31 were diagnosed “psychoneurosis,
anxiety state,” 11 “psychoneurosis, conversion
hysteria,” and 15 “psychoneurosis, mixed.”
The remainder were miscellaneous types.

Organics. This group included all cases diagnosed
as having intracranial organic pathology
without psychosis or psychoneurosis. Of the 45 patients
in this group, 15 were posttraumatic encephalopathies,
11 were brain tumor and cyst
cases, 9 were cases of encephalitis, and the remainder
were miscellaneous.

Schizophrenics. This group included all patients
so diagnosed, irrespective of severity. There were
18 such patients in all, of whom 10 were paranoids,
4 simples, 3 unclassified, and one hebephrenic.

For each group, the means and standard deviations
for age and IQ were computed, and the
groups were approximately equated for these 
two factors by elimination of a few cases. The 
resulting measures for the three groups are 
given in Table 1. 


TABLE 1
AGE AND IQ CHARACTERISTICS OF THE EXPERIMENTAL GROUPS

                            Age              IQ
                   N    Mean    SD      Mean     SD
Psychoneurotics    81   29.1    5.7    102.8    13.7
Organics           45   27.4    4.7     99.1    14.0
Schizophrenics     18   28.1    5.6    102.0    14.2


PROCEDURE


The variables were the following: 


1. A score obtained by subtracting the patient’s 
Wechsler-Bellevue Full Scale IQ from his Wech- 
sler Memory Scale quotient (MQ—IQ). This score 
was intended to measure discrepancies between 
general intellectual and general memory function- 
ing. 

2. Total score on Mental Control (Subtest III 
of the Wechsler Memory Scale. All further vari- 
ables refer to subtests and parts of subtests on this 
scale.) This subtest measures, in terms of speed 
and errors, ability to count backwards from twenty 
to one, to recite the alphabet, and to count by 
threes from one to 40. 

3. Total score on Logical Memory (IV). This 
test is made up of two six-line news-item-like stor- 
ies, each of which is read to the subject, immediate 
recall for which is measured by the average num- 
ber of memories reproduced. 

4. Score on Logical Memory, Part A (IV A). 
The score is the total number of memories for the 
first story. 

5. Score on Logical Memory, Part B (IV B).
The score is the total number of memories for the
second story.

6. Score on Logical Memory, Part A minus
score on Logical Memory, Part B (IV A - IV B).
This measure was used with the intent of measuring
ability to shift from one story to the next.

7. Score on Digits Forward (V A). This test
measures ability to repeat digits presented orally
by the examiner in series of increasing length, from
four to eight digits. If one series of digits is
failed, a different series of the same length is presented.
If the second series is failed, the test is
discontinued and the score is the greatest number
of digits correctly repeated.

8. Score on Digits Backward (V B). This test
consists of a series from three to seven digits in
length, administered and scored identically with
Digits Forward, except that the patient is required
to repeat the digits in reverse order.

9. Score on Digits Forward minus score on
Digits Backward (V A - V B). This score was
intended to measure ability to shift from one rote
memory task to its opposite.

10. Total score on Digit Repetition (V). This
was obtained by adding the scores on Digits Forward
and Digits Backward.

11. Total score on Visual Reproduction (VI).
The patient is required to reproduce from memory
simple geometric figures exposed for ten seconds
and the reproduction is scored according to objective
criteria set down in the manual [9, p. 93].

12. Total score on Associate Learning (VII).
This test consists of ten word pairs presented orally
to the subject, who is required to respond with
the second word of each pair when the first is
given. Of these, six are easy related associations
(i.e., north-south, fruit-apple) and four are difficult
unrelated associations (i.e., obey-inch, cabbage-pen).
There are three presentations of the
list, each followed by a complete test. The total
score is obtained by adding one half of the total
number of correct easy associates to the total number
of correct hard associates for the three trials.

13. Score on Associate Learning, Easy Pairs
(VII A). This is one half of the number of correct
easy associates for the three trials.

14. Score on Associate Learning, Hard Pairs
(VII B). This is the number of correct hard associates
for the three trials.

15. Score on Associate Learning, Easy Pairs
minus score on Associate Learning, Hard Pairs
(VII A - VII B). This measure is intended to
provide a comparison between old learning (Easy
Pairs) and new learning (Hard Pairs).


TABLE 2
MEANS, STANDARD DEVIATIONS, AND F VALUES OF THE 15 WECHSLER MEMORY
SCALE VARIABLES IN THE EXPERIMENTAL GROUPS

                Psychoneurotic        Organic        Schizophrenic
Variables          Mean       SD     Mean       SD     Mean       SD        F
MQ - IQ            -5.2     12.1     -3.6     12.4     -8.0      9.7        *
III                 6.5      2.2      6.8      2.3      6.6      2.5        *
IV                  8.9      3.6      8.9      3.3      9.2      2.7        *
IV A                9.4      4.2      9.2      4.1      9.6      3.8        *
IV B                8.3      3.8      8.7      4.0      8.8      4.0        *
IV A - IV B         1.3      3.5       .6      3.9       .8      3.6        *
V A                 6.3      1.3      6.1      1.3      6.3       .9        *
V B                 5.0      1.3      4.6      1.3      4.9      1.5      1.1
V A - V B           1.3      1.3      1.6      1.1      1.4      1.5        *
V                  11.3      2.3     10.7      2.3     11.2      1.9 [illeg.]
VI                  8.6      3.1      8.4      3.0      7.4      3.6      1.1
VII                14.7      3.7     14.2      4.4     15.5      2.9      2.3
VII A               8.3 [illeg.]      7.9      1.4      8.3 [illeg.]      2.2
VII B               6.4      3.0      6.2      3.5      7.2      2.6        *
VII A - VII B       1.9      2.7      1.7      3.0      1.1      2.2        *

*The within-group variance exceeds the between-group variance, therefore F is less than unity and need
not be computed, since it is a purely chance result.
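
The derived scores in the list of variables above are simple arithmetic on the subtest scores. The following is a minimal sketch of that arithmetic; the function names and the patient values are hypothetical and do not come from the study's data.

```python
def associate_learning_scores(easy_correct: int, hard_correct: int) -> dict:
    """Variables 12-15, from counts of correct easy and hard associates
    summed over the three trials."""
    vii_a = easy_correct / 2        # easy pairs receive half credit
    vii_b = float(hard_correct)     # hard pairs receive full credit
    return {
        "VII": vii_a + vii_b,
        "VII A": vii_a,
        "VII B": vii_b,
        "VII A - VII B": vii_a - vii_b,
    }


def difference_scores(mq, iq, iv_a, iv_b, v_a, v_b) -> dict:
    """Variables 1, 3, 6, 9 and 10 from the quotients and part scores."""
    return {
        "MQ - IQ": mq - iq,
        "IV": (iv_a + iv_b) / 2,    # average number of memories per story
        "IV A - IV B": iv_a - iv_b,
        "V": v_a + v_b,
        "V A - V B": v_a - v_b,
    }


# Hypothetical patient, for illustration only.
assoc = associate_learning_scores(easy_correct=16, hard_correct=6)
diffs = difference_scores(mq=98, iq=103, iv_a=9, iv_b=8, v_a=6, v_b=5)
print(assoc)
print(diffs)
```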


The means and standard deviations of each 
of the three groups on each of the above fifteen 
variables were computed, and the significance 
of differences among the three means for each 
variable was investigated by means of an analysis
of variance into two parts using Snedecor’s
F ratio [6, pp. 87-104].
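
For three groups, a two-part analysis of variance of this kind reduces to the ratio of the between-groups mean square to the within-groups mean square. The following is a minimal sketch that recomputes the ratio from summary statistics alone. It assumes the reported standard deviations use an n - 1 denominator, which the source does not state, and it uses the rounded MQ - IQ means and SDs from Table 2 with group sizes of 81, 45, and 18, so the result is only approximate.

```python
def f_ratio_from_summary(ns, means, sds):
    """One-way analysis of variance F ratio from group sizes, means, and SDs.

    Assumes each SD was computed with an n - 1 denominator.
    """
    total_n = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    ms_between = ss_between / (k - 1)        # 2 degrees of freedom here
    ms_within = ss_within / (total_n - k)    # 141 degrees of freedom here
    return ms_between / ms_within


group_sizes = [81, 45, 18]   # psychoneurotic, organic, schizophrenic
f_mq_iq = f_ratio_from_summary(group_sizes, [-5.2, -3.6, -8.0], [12.1, 12.4, 9.7])
print(round(f_mq_iq, 2))
```

With these inputs the ratio comes out below unity, well short of the 3.06 needed at the 5 per cent level for 2 and 141 degrees of freedom.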


RESULTS 


The results of the statistical analysis are presented
in Table 2. Since none of the F values
even approaches significance at the 5 per cent
level of confidence,* the null hypothesis, i.e.,
that there are no differences among the group
means of the variables other than those due to
chance, cannot be refuted, and the results are
consonant with the hypothesis that there are
no real (i.e., extra-chance) differences among
the groups with respect to any of the fifteen
memory variables investigated.

*Table 4 of Lindquist [6, pp. 62-65] was referred
to for significance of F values. For 2 and 141 degrees
of freedom, F values of 3.06 and 4.76 are
necessary, respectively, at the 5 per cent and 1
per cent levels of confidence.

Furthermore, an inspection of the distribu- 
tions of each variable for each of the three 
groups reveals no significant differences in 
their shape or variance. 


THE RESULTS COMPARED WITH THOSE OF 
OTHER INVESTIGATORS 


Klebanoff states that it is neurological text- 
book doctrine that organic brain disease ini- 
tially affects recent memory functioning with 
remote memory functioning remaining intact 
unless the disease process is a degenerative one, 
in which case it, too, is ultimately affected; in
severe traumata or disease, both recent and re- 
mote memory ability are held to be affected 
[5, p. 605]. These contentions are partially 
tested in the present study, insofar as memory 
for easy and hard associated word pairs may be 
considered measures of old and new learning 
ability, respectively. The four variables used 
to tap this function (VII, VII A, VII B and 
VII A—VII B) did not distinguish among the 
groups. It may still remain true that organics 
do poorly on the hard pairs when compared 
with normals, but the findings indicate that 
they do no worse than neurotics or schizo- 
phrenics. It may be contended that this test 
does not measure the functions meant by the 
neurologists when they speak of old and new 
learning, but since the easy associations are of 
relationships laid down in childhood and con- 
stantly reinforced, and the hard associations, 
being unrelated, must be learned “on the spot”
by the patient, the test would seem to be at 
least partially tapping the function concerned. 


In Armitage’s [2] comparison of the num- 
ber of memories immediately recalled follow- 
ing the reading of a story, he found that or- 
ganics recalled significantly fewer memories 
than did a combined control group of normals 
and neurotics, while the latter two groups did 
not differ significantly from each other. The 
present study found no differences among the 
groups in performance on a similar test (IV). 
This discrepancy is probably a function of the
greater length of the story used by Armitage
[2], and possibly of the differing content of 
the stories, which cannot be evaluated since 
he does not present his material. 


The findings of the present study agree 
closely with those of Aita et al. [1] with re- 
spect to the failure of the Digit Span test (V) 
to discriminate between organics and others. 
In agreement with those authors, too, were the 
mean scores, which were approximately six 
digits forward and five backwards in all three 
groups. 

The relatively excellent differentiation 
achieved by Graham and Kendall [3] between 
organic and other patients with their Memory- 
for-Designs test is difficult to evaluate because 
their groups were not equated for IQ. They 
were equated for number of school years com- 
pleted, and although this would tend to equate 
them for premorbid intellectual functioning, 
it leaves the question of degree of general in- 
tellectual deterioration present at time of test- 
ing unanswered. Thus, it is quite possible that 
the differences they found were not specifically 
differences in memory functioning at all. Fur- 
thermore, a large majority of their organic pa- 
tients were diagnosed as paretics, making their 
findings not representative of brain-damaged 
patients generally. Another consideration help- 
ing to explain the difference between their 
findings and those of the present study is the 
fact that they used fifteen designs as compared 
to the four in subtest VI of the Wechsler 
Memory Scale, making their test more reliable 
and increasing whatever differential validity 
the test may have. 

Nor does the present study confirm Zangwill’s
[10] findings. The paired associates
subtest scores and their combinations (VII, VII A,
VII B, and VII A - VII B) all failed to
discriminate significantly among the groups
studied.


SUMMARY AND CONCLUSIONS 


The memory functioning of psychiatrically 
diagnosed groups of psychoneurotic, organic, 
and schizophrenic veteran patients, equated 
for age and IQ, was compared for 15 scores de- 
rived from the Wechsler Memory Scale with 
the purpose of finding diagnostically useful 
patterns of memory ability. Using a two part 
analysis of variance, no significant differences


IN the diagnosis of “organic brain damage”²
reliance is frequently placed upon
psychological techniques. In general when
psychotic behavior is not being evidenced by the
patient, the diagnosis of brain damage is made
easier in that the subject is better able to respond
to tests and questions and it is possible
to determine relatively easily whether or not
there has been loss of function. If the patient’s
behavior, however, is psychotic or psychotic-appearing,
it is extremely difficult to determine
whether the loss of function is related to observable
damage to the brain. It is in this difficult
diagnostic situation that differentiating
psychometric tools are most needed and most
useful.

Various psychological techniques are presently
employed for the purpose of detecting
brain damage and differentiating it from
“functional” loss. None of these has thus far
yielded very high validity and it is doubtful
that they add greatly to the neurological examination
routinely administered by medical
practitioners. One of these devices is the
Wechsler Memory Scale (WMS), the author’s [12]
sole statement regarding its utility
being that it “. . . should be useful in detecting
special memory defects in individuals with
specific organic brain injuries. . . .” The only
study of the WMS known to the present
writer is that of Cohen [3], conducted at the
Bronx Veterans Administration Hospital, New
York. Cohen administered the WMS to groups
of “psychoneurotics,” “organics” and “schizophrenics”
and did not find any results significant
at the 5 per cent level of confidence.


1The author wishes to express his appreciation to
Dr. Julian B. Rotter for many helpful suggestions
and criticisms. He is also grateful to Dr. Ranald
A. Wolfe, Chief Psychologist, Veterans Administration
Hospital, Chillicothe, Ohio and to the management
and staff of this institution for their cooperation
in making this research possible.

2Use of the term “organic” in this paper is not
meant to imply any mind-body dichotomy, but
rather will be considered synonymous with a syndrome
medically associated with observable, structural
damage to the brain.

Numerous related investigations have been 
conducted in this area. Employing the Wells 
and Martin [14] “Memory Examination” 
with 179 patients given “functional” or “organic”
diagnoses, Wells [13] concludes that
substitution and hard town-state items are af- 
fected most adversely in “psychotic conditions.” 
The Shipley-Hartford Retreat Scale [11] has 
not been proved useful for differentiating the 
“organic” from the “nonorganic” nor has H. 
F. Hunt’s [7] technique, inasmuch as no indi- 
viduals considered to be psychotic were in- 
cluded in the standardization. Meehl and Jeff- 
ery [10] question the specificity of the Hunt- 
Minnesota test for organic brain damage. 
Graham and Kendall [6] feel that their 
“. . . measure of visual-motor ability has eliminated
the elements common to deterioration
with both age and brain damage, retaining on- 
ly those due to brain damage alone.” Follow- 
ing a critical survey of the literature, Kleban- 
off [8] concludes that the findings regarding 
memory functioning are inconsistent and con- 
tradictory. He expresses the opinion that 
“. . . gross estimates of memory ability can never
be as satisfactory as standardized and quanti- 
tative studies of the relative reproduction of 
recent and remote memories.” 


PROBLEM 


As a combined result of its rather common
use and the dearth of information regarding
its efficacy, an attempt was made to determine
the degree to which the WMS was useful in
distinguishing a group of diagnosed “organic”
patients from a control group of mentally
disturbed patients lacking such categorization.
It was deemed desirable to ascertain the extent
to which the Memory Quotient (MQ) as a
whole might serve to differentiate between
these two groups as well as to compute data
indicating the level of significance of various
individual items in relation to the general
problem. Also believed to be of some value
was the degree of correlation between length
of hospitalization (as measured from the date
of latest admission) and the MQ of the experimental
subjects.³ Another aspect of the research
received its stimulation as a result of
Wechsler’s [12] equating the MQ with the
Wechsler-Bellevue full scale IQ. The weighted
vocabulary score of the W-B Form I⁴ multiplied
by 10 was employed as a crude measure
of previous level of function. The present level
as indicated by the MQ might then be subtracted
from this figure to yield some estimate
of loss of function, thus providing a possible
basis for discriminating experimental and control
groups. The final aspect of the problem
was to determine whether “remote”⁵ or “recent”
scores would reveal any distinction between
groups.


TABLE 1
EXPERIMENTAL AND CONTROL GROUP MEANS ON FOUR SELECTED VARIABLES*

          Comb. Gps.      Par.-Enc.       Paretic A       Encephalitic    Epileptic       Paretic B
          (N=86)          (N=60)          (N=40)          (N=20)          (N=26)          (N=30)
Age       49-8 (48-8)     49-7 (48-6)     51-7 (50-5)     45-8 (44-7)     49-10 (49-4)    51-6 (51-1)
Hosp.     4-4 (4-4)       4-5 (4-5)       5-1 (5-1)       3-1 (3-1)       4-0 (4-1)       4-1 (4-1)
Educ.     3.0† (3.2)†     2.8 (3.1)       2.6† (3.1)†     3.0 (3.1)       3.5 (3.3)       3.0‡ (3.4)
Occ.      3.7 (3.7)       3.6 (3.4)       3.7 (3.5)       3.5 (3.4)       3.8 (4.2)       3.6 (4.0)

*Control group data appear in parentheses in the body of the table.
†Educational data in the paretic group available for only 19 experimentals and 19 controls.
‡Educational data available for only 13 experimentals.


3A Pearson product moment correlation coefficient
of -.28 (S.E. = .14) resulted, indicating that
duration of illness as defined in this study has
little effect on an individual’s MQ. Comparable
findings are reported by Bebb [1], Collins [4] and
Collins, Atwell and Moore [5]. Different conclusions
are reached by Capps [2] and Lennox [9].

4Form II was utilized in three cases of prior
administration of Form I.

5Subtests included under the “remote” classification
were Personal and Current Information, Orientation
and Mental Control. “Recent” tests were
Logical Memory, Digits, Visual Reproduction and
Associate Learning.
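
The loss-of-function estimate described in the Problem section (weighted vocabulary score multiplied by 10, minus the MQ) can be sketched as below. The function name is illustrative. The inputs are the paretic A group means and their control means from Table 2, used only as an example; the study would presumably have computed the estimate per patient.

```python
def estimated_loss(weighted_vocab: float, mq: float) -> float:
    """Crude loss-of-function estimate: (weighted vocabulary x 10) minus MQ."""
    return weighted_vocab * 10 - mq


# Illustration with group means: paretic A (vocabulary 7.6, MQ 67)
# and its control group (vocabulary 9.2, MQ 84).
paretic_a_loss = estimated_loss(7.6, 67)
control_loss = estimated_loss(9.2, 84)
print(round(paretic_a_loss, 1), round(control_loss, 1))
```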


PROCEDURE 

Initially the WMS and the W-B vocabulary
were administered to 43 patients with
psychotic and “organic” diagnoses (experimental
group) and an equal number of patients
with only a psychotic diagnosis (control
group). The experimental group contained 20
paretics (paretic A), 10 encephalitics and 13
epileptics. A second group of 15 paretics (paretic B)
and 15 matched controls was later obtained
for comparative purposes when analysis
of initial results for experimental and control
groups suggested the utility of the WMS for
differentiating only paretic A from its control.
With few exceptions⁷ these individuals
did not possess other medical diagnoses often
associated with “organic brain disorders.” It
may be assumed that whatever brain damage
has occurred in paresis tends to be of a grosser
nature, and is thus more readily observable,
than that commonly expected in encephalitis
or epilepsy. In general, the literature suggests
that one might anticipate grosser damage in
the encephalitic than the epileptic. This assumption
may not be extended to posttraumatic
epilepsy, but no cases of the latter were
included in the study.

Means (Table 1) and ranges on four 
matched variables (age, length of hospitaliza- 
tion, education, and occupation) were approxi- 
mately the same for all groups studied. The 
“Comb. Gps.” in Table 1 refers to the com- 
bined data for all groups except paretic B. 


6In view of the negative findings for the en- 
cephalitic and epileptic groups, analysis of “re- 
mote” and “recent” data was confined to paretic A 
and its control. 

7One paretic and one encephalitic were diag- 
nosed cerebral arteriosclerosis; two encephalitics 
and one epileptic were diagnosed generalized ar- 
teriosclerosis; and one encephalitic was diagnosed 
chronic alcoholism.


TABLE 2
EXPERIMENTAL AND CONTROL GROUP MEANS ON 22 SELECTED VARIABLES*

                      Comb. Gps.      Par.-Enc.       Paretic A       Encephalitic    Epileptic       Paretic B
                      (N=86)          (N=60)          (N=40)          (N=20)          (N=26)          (N=30)
W-B Wtd. Vocab.       7.7 (9.0)       7.9 (8.8)       7.6 (9.2)       8.6 (8.2)       7.3 (9.4)       6.1 (7.9)
WMS MQ                72 (83)         70 (83)         67 (84)         74 (81)         77 (83)         72 (83)
P. and C. Info.       3.5 (4.6)       3.0 (4.7)       2.6 (4.7)       3.9 (4.7)       4.6 (4.5)       3.5 (5.0)
Orientation           3.5 (4.2)       3.2 (4.2)       2.8 (4.3)       4.0 (4.0)       4.2 (4.4)       3.9 (4.1)
Ment. Cont. (Total)   4.7 (6.0)       4.5 (6.3)       4.0 (6.6)       5.5 (5.6)       5.3 (5.4)       4.1 (5.9)
Count. Backw. 20-1    1.9 (2.0)       1.9 (2.1)       1.9 (2.2)       2.1 (2.1)       1.8 (1.8)       1.9 (2.1)
Alphabet              1.9 (2.3)       1.7 (2.3)       1.5 (2.5)       2.2 (1.9)       2.4 (2.4)       1.1 (2.2)
Count. by Threes      .9 (1.6)        .8 (1.8)        .7 (2.0)        1.2 (1.6)       1.1 (1.2)       1.2 (1.6)
Log. Mem. (Avg.)      3.1 (3.7)       2.6 (3.8)       2.2 (3.8)       3.4 (3.8)       4.3 (3.5)       2.6 (2.9)
Log. Mem. “A”         3.4 (3.9)       2.8 (4.3)       2.5 (4.3)       3.5 (4.5)       4.8 (2.8)       2.7 (3.1)
Log. Mem. “B”         2.8 (3.5)       2.3 (3.2)       1.9 (3.3)       3.2 (3.0)       3.8 (4.2)       2.4 (2.6)
Digits (Total)        8.7 (9.6)       8.7 (9.7)       8.1 (9.6)       9.9 (9.9)       8.8 (9.5)       8.5 (9.2)
Digits Forward        5.3 (5.8)       5.3 (5.9)       5.1 (5.9)       5.7 (6.1)       5.3 (5.5)       5.4 (5.6)
Digits Backward       3.4 (3.8)       3.4 (3.7)       3.0 (3.7)       4.2 (3.8)       3.5 (4.0)       3.1 (3.6)
Vis. Rep. (Total)     2.5 (5.0)       2.5 (5.3)       2.1 (5.5)       3.2† (4.9)      2.5 (4.3)       3.1 (4.4)
Vis. Rep. “A”         1.1 (1.4)       1.2 (1.4)       1.2 (1.3)       1.1 (1.5)       1.0 (1.5)       1.2 (1.3)
Vis. Rep. “B”         .8 (1.8)        .9 (2.1)        .7 (2.2)        1.3 (1.9)       .6 (1.2)        .9 (1.3)
Vis. Rep. “C-1”       .2 (1.1)        [illeg.] (1.1)  [illeg.] (1.1)  .3 (1.2)        .5 (1.2)        .7 (1.3)
Vis. Rep. “C-2”       .3 (.6)         .2 (.7)         .2 (.9)         .4 (.3)         .4 (.5)         .3 (.5)
Assoc. Lng. (Total)   7.9 (9.4)       7.3 (9.2)       6.9 (9.0)       8.2 (9.5)       9.2 (10.0)      7.6 (10.5)
Assoc. Lng. (Easy)    5.8 (6.1)       5.4 (6.1)       5.1 (6.0)       6.2 (6.1)       6.6 (6.1)       6.1 (7.1)
Assoc. Lng. (Hard)    2.1 (3.4)       1.9 (3.1)       1.8 (3.0)       2.0 (3.4)       2.6 (3.9)       1.5 (3.5)

*Control group data appear in parentheses in the body of the table.
†Based on nine cases.























TABLE 3
F RATIOS ON 22 VARIABLES FOR SELECTED GROUPS

                      Comb. Gps.  Par.-Enc.   Paretic     Enceph.     Epil.
                      (N=86)      (N=60)      (N=40)      (N=20)      (N=26)
                      5%=3.96     5%=4.00     5%=4.10     5%=4.38     5%=4.24
                      1%=6.96     1%=7.08     1%=7.35     1%=8.18     1%=7.77
W-B Wtd. Vocab.       3.80        1.28        2.29        .09         3.29
WMS MQ                11.92**     13.97**     16.05**     1.00        .69
P. and C. Info.       7.58**      12.03**     13.15**     .91         .08
Orientation           8.92**      10.68**     14.01**     0           0
Ment. Cont. (Total)   5.25*       7.76**      13.30**     0           0
Count. Backw. 20-1    .45         .65         1.10        0           0
Alphabet              2.36        3.05        6.52*       [illeg.]    0
Count. by Threes      7.49**      10.56**     16.73**     .38         0
Log. Mem. (Avg.)      1.88        5.12*       5.82*       .21         1.01
Log. Mem. “A”         .68         4.73*       4.26*       .74         3.85
Log. Mem. “B”         2.96        3.05        5.82*       .06         .30
Digits (Total)        3.36        2.77        5.63*       .02         .38
Digits Forward        1.77        2.53        2.85        .41         .68
Digits Backward       2.67        1.02        3.54        .29         .88
Vis. Rep. (Total)     16.39**     14.11**     14.13**     1.35        2.64
Vis. Rep. “A”         3.09        1.01        .14         .90         2.63
Vis. Rep. “B”         11.47**     11.10**     13.89**     .58         1.44
Vis. Rep. “C-1”       23.09**     20.36**     18.18**     3.40        3.46
Vis. Rep. “C-2”       2.93        3.83        5.47*       .19         .04
Assoc. Lng. (Total)   1.76        2.07        1.56        .42         .19
Assoc. Lng. (Easy)    .73         1.06        1.44        0           .23
Assoc. Lng. (Hard)    4.76*       3.04        1.87        1.09        1.00

*Denotes significance above the 5% level of confidence. **Denotes significance above the 1% level of confidence.
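
The asterisks in Table 3 follow from comparing each F ratio with the 5% and 1% critical values listed under its column heading. The following is a minimal sketch of that comparison; the function name is illustrative, and treating a value exactly equal to a critical value as significant is an assumption.

```python
def significance_marker(f_value: float, crit_5: float, crit_1: float) -> str:
    """Return '**' at the 1% level, '*' at the 5% level, '' otherwise."""
    if f_value >= crit_1:
        return "**"
    if f_value >= crit_5:
        return "*"
    return ""


# Combined-groups column of Table 3: 5% = 3.96, 1% = 6.96.
mq_marker = significance_marker(11.92, 3.96, 6.96)           # WMS MQ
mental_control_marker = significance_marker(5.25, 3.96, 6.96)
vocab_marker = significance_marker(3.80, 3.96, 6.96)         # W-B vocabulary
print(repr(mq_marker), repr(mental_control_marker), repr(vocab_marker))
```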
TABLE 4
F RATIOS ON 22 VARIABLES FOR SEPARATE AND COMBINED PARETIC GROUPS

                      Paretic A     Paretic B     Paretic A and B
                      (N = 20)      (N = 15)      (N = 35)
                      5% = 4.10     5% = 4.18     5% = 3.98
                      1% = 7.35     1% = 7.60     1% = 7.01
W-B Wtd. Vocab.       2.29          3.18          5.31*
WMS MQ                16.05**       5.00*         20.27**
P. and C. Info.       13.15**       5.24*         11.33**
Orientation           14.01**       .31           9.40**
Ment. Cont. (Total)   13.30**       3.68          16.00**
Count. Backw. 20-1    1.10          .50           1.57
Alphabet              6.52*         6.08*         12.75**
Count. by Threes      16.73**       .76           11.47**
Log. Mem. (Avg.)      5.82*         .26           5.53*
Log. Mem. “A”         4.26*         .31           4.21*
Log. Mem. “B”         5.82*         .07           3.49
Digits (Total)        5.63*         .87           7.10**
Digits Forward        2.85          .29           1.99
Digits Backward       3.54          1.84          7.86**
Vis. Rep. (Total)     14.13**       1.81          13.86**
Vis. Rep. “A”         .14           .40           .90
Vis. Rep. “B”         13.89**       1.22          12.65**
Vis. Rep. “C-1”       18.18**       2.35          15.00**
Vis. Rep. “C-2”       5.47*         .44           5.08*
Assoc. Lng. (Total)   1.56          4.29*         4.77*
Assoc. Lng. (Easy)    1.44          4.31*         3.03
Assoc. Lng. (Hard)    1.87          3.80          5.52*

*Denotes significance above the 5% level of confidence. **Denotes significance above the 1% level of confidence.


“Par.-Enc.” denotes a similar combination of
data for paretic A and the encephalitic group
out of deference to those who prefer to exclude
the epileptic from consideration with
“organic” cases.

Several deviations from Wechsler’s scoring
were made on Personal and Current Information
items in view of the varied residences of
the subjects. If the governor of a patient’s
home state was in office during the current
hospitalization, item five (Governor) was
credited. Item six (Mayor) was excluded
from the scoring, the subtest score being computed
on a prorated basis.


RESULTS


Means on the W-B vocabulary and 21
WMS items for all groups in both studies are
presented in Table 2. The F ratios obtained
by an analysis of variance technique employed
on the 22 test variables are cited in Tables 3
and 4. It will be noted that the addition of the
15 paretics from group B did not alter the
levels of significance recorded by paretic A
with the sole exception of Logical Memory
“B.” Several items originally below the 1 per
cent or 5 per cent level of confidence attain a 
greater degree of statistical significance as a 
result of the inclusion of paretic B and its con- 
trol. Other findings of the initial phase of the 
study also were unchanged by the incorpora- 
tion of paretic B data. The MQ does not dif- 
ferentiate sharply between combined groups 
nor does subtraction of the MQ from the 
W-B weighted vocabulary score multiplied by 
10 offer critical scores which effectively dis- 
criminate any of the experimental groups from 
their respective controls. Many MQs were 
higher than the vocabulary measure. Some- 
what analagously, Cohen [3] subtracted the 
W-B full scale IQ from the MO and found 
an F ratio “less than unity.” Intragroup com- 
parisons of “remote” and “recent” scores of 
paretic experimentals and controls reveal no 
significant difference. When ‘ 


‘ 


‘remote’ minus 
%? . . 

‘recent’’ scores are subjected to intergroup 

comparison, the experimentals’ 


ne 
“ ” > 
on remote tests is poorer. 


performance 


The studies here reported suggest that fur-
ther work on the WMS is desirable with paretic
patients. Special emphasis may be applied
profitably to those items significant above the
1 per cent level of confidence. In this manner,
the present results may be checked and the
value of the WMS with various groups of
paretics ascertained.
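
The significance markers in Table 4 follow from comparing each F ratio with the tabled 5 per cent and 1 per cent critical values for its group. The following is a minimal sketch of that comparison, using the Alphabet row and the critical values as printed in Table 4; the function name and the data layout are illustrative assumptions, not part of the original study.

```python
def significance_marker(f_ratio, crit_5, crit_1):
    """Return '**', '*', or '' by comparing an F ratio with tabled critical values."""
    if f_ratio > crit_1:
        return "**"  # significant above the 1 per cent level
    if f_ratio > crit_5:
        return "*"   # significant above the 5 per cent level
    return ""        # not significant at either level


# Critical values and Alphabet F ratios as printed in Table 4.
# The dictionary layout is a hypothetical convenience for this sketch.
groups = {
    "Paretic A": {"crit_5": 4.10, "crit_1": 7.35, "alphabet_f": 6.52},
    "Paretic B": {"crit_5": 4.18, "crit_1": 7.60, "alphabet_f": 6.08},
    "Paretic A and B": {"crit_5": 3.98, "crit_1": 7.01, "alphabet_f": 12.75},
}

markers = {
    name: significance_marker(g["alphabet_f"], g["crit_5"], g["crit_1"])
    for name, g in groups.items()
}

for name, marker in markers.items():
    print(f"{name}: {groups[name]['alphabet_f']}{marker}")
```

Under these values the Alphabet item is marked at the 5 per cent level for each paretic group taken separately and at the 1 per cent level for the combined group, which agrees with the remark that several items gain in significance once paretic B is included.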


SUMMARY 


1. A study of the Wechsler Memory Scale 
utilizing matched groups of so-called “psy- 
chotic-organic” and “psychotic-nonorganic” pa- 
tients is reported. In the initial phase of the in- 
vestigation, the experimental group consisted 
of individuals diagnosed as paretic, encephalit- 
ic and epileptic, whereas in the final phase ad- 
ditional groups of paretics and matched con- 
trols were incorporated for comparative pur- 
poses. 

2. Analysis of results failed to yield a dis- 
crimination between the encephalitic and epi- 
leptic groups and their respective controls. 
However, the WMS did differentiate between 
a group of 35 paretics and matched controls 
on certain items. 

3. Items revealing significant differences be- 
tween the paretics and their controls above the 
1 per cent level of confidence were Memory 
Quotient, Personal and Current Information, 
Orientation, Total Mental Control, Counting 
by Threes, Total Visual Reproduction, Visual 
Reproduction “B” and “C-1.” Above the 5 
per cent level of confidence were Alphabet, 
Average Logical Memory, Logical Memory 
“A,” Total Digits and Visual Reproduction
“C-2.” 

4. It is hypothesized that the WMS will 
discriminate successfully between matched 
groups when brain damage tends to be of a 
gross nature.



ALVIN R. HOWARD 


REFERENCES 


1. Bess, Grace L. A study of memory deterioration
   in encephalitis lethargica. J. nerv. ment. Dis.,
   1925, 61, 356-364.

2. Capps, H. M. Vocabulary changes in mental
   deterioration. Arch. Psychol., N. Y., 1939, No.
   242.

3. Cohen, J. Wechsler Memory Scale performance
   of psychoneurotic, organic and schizophrenic
   groups. J. consult. Psychol., 1950, 14, 371-375.

4. Collins, A. L. Psychometric records of
   institutionalized epileptics. J. Psychol., 1941, 11,
   359-370.

5. Collins, A. L., Atwell, C. R., and Moore,
   Merrill. Stanford-Binet response patterns in
   epileptics. Amer. J. Orthopsychiat., 1938, 8,
   51-63.

6. Graham, Frances K., and Kendall, Barbara S.
   Performance of brain-damaged cases
   on a memory-for-designs test. J. abnorm. soc.
   Psychol., 1946, 41, 303-314.

7. Hunt, H. F. A practical clinical test for organic
   brain damage. J. appl. Psychol., 1943,
   27, 375-386.

8. Klebanoff, S. G. Psychological changes in
   organic brain lesions and ablations. Psychol.
   Bull., 1945, 42, 585-623.

9. Lennox, W. G. Seizure states. In J. McV.
   Hunt (Ed.), Personality and the behavior
   disorders. New York: Ronald Press, 1944. Vol.
   II, pp. 938-967.

10. Meehl, P. E., and Jeffery, Mary. The
    Hunt-Minnesota Test for organic brain damage
    in cases of functional depression. J. appl.
    Psychol., 1946, 30, 276-287.

11. Shipley, W. C. A self-administering scale
    for measuring intellectual impairment. J.
    Psychol., 1940, 9, 371-377.

12. Wechsler, D. A standardized memory scale
    for clinical use. J. Psychol., 1945, 19, 87-95.

13. Wells, F. L. Mental tests in clinical practice.
    Yonkers, N. Y.: World Book Co., 1927.

14. Wells, F. L., and Martin, Helen A. A. A
    method of memory examination suitable for
    psychotic cases. Amer. J. Psychiat., 1923, 3,
    243-257.





BEHAVIOR RATINGS OF POST-POLIO CASES 


DALE B. HARRIS 


INSTITUTE OF CHILD WELFARE, UNIVERSITY OF MINNESOTA 


IN 1946 there was an occurrence of poliomyelitis
in Minnesota, especially in the
Twin Cities area, which reached epidemic
proportions. Some 765 persons were stricken in 
the city of Minneapolis alone, 620 of them un- 
der 18 years of age. Quite complete school re- 
cords were available on 210 school children of 
this latter group. One hundred seven of these 
210 had had the Stanford-Binet individual in- 
telligence examinations prior to their illness. 

This group of 107 was studied intensively 
by Phillips, Berman, and Hanson, of the Minneapolis
Public Schools in the summer of 1947
[6]. One hundred one were reexamined on 
the Stanford-Binet, together with 101 con- 
trol children matched on CA, sex, and IQ on 
initial tests. In any epidemic “scare,” there is 
the possibility that many illnesses are wrongly 
diagnosed as the one of great concern at the 
moment. This possibility was considered and 
the evidence for the diagnosis of polio in all 
clinical cases was very carefully reviewed by 
Dr. Hanson, the medical author of the paper. 
The monograph states: “From the evidence 
presented herein, there seems to be little doubt 
that these cases were bonafide poliomyelitis 
cases” [6, p. 45]. 

The monograph cites certain intellectual 
(IQ) changes in the clinical group, most of 
the “loss” occurring among the younger boys.
There were no differences between the clinical 
and control groups on Hunt-Minnesota brain- 
damage tests and on the California Personality 
Test. The few cases wherein polio had a crip- 
pling effect did not differ in performance on 
any measure from the noncrippled post-polio 
cases. No positive results were obtained when 
diagnostic categories of the disease were com- 
pared, when groups of differing length of hos- 
pitalization were compared, or when groups 
differing in severity of spinal symptoms were 
contrasted. 

In discussion prior to the psychological study
of their children, the mothers of about two
thirds of the clinical cases volunteered the in- 
formation that their children were more “rest- 
less, irritable, impulsive and given to easy 
fatigue” [6, p. 19] since their illness. A number 
described an increase in nervous habits. The 
authors of the monograph are inclined to the 
belief that the California Personality Test is 
not sufficiently sensitive to uncover such per- 
sonality consequences of polio. They are wil- 
ling, however, to assign psychological conse- 
quences as much to long hospitalization or seg- 
regation as to the disease itself. 

The literature indicates some psychological 
repercussions of polio. Stanfield [8] quotes an 
unpublished Rorschach study of 11 children, 
all over 13 years. These records showed evi- 
dences of anxiety or neurotic shock, presumably 
associated with the patients’ evaluation of their 
condition. Follow-up records in some of the 
cases indicated reduction of these signs as the 
patients recovered. Lowman and Seidenfeld 
[4] show that about one-fifth of their post- 
polio cases (90 per cent having the disease 
prior to age 16) failed to make a reasonably 
satisfactory social adjustment as young adults. 
Rosenbaum’s survey of crippled girls [7] using 
the Thurstone personality schedule shows 
many more “neurotic” symptoms than in the 
norm group. Copellman [2] states with re- 
spect to 100 children under 15 who contracted 
polio: “It was the general impression that there 
was an unusually large number of (behavior) 
difficulties in this particular group” [2, p.
292]. The social worker at the convalescent 
center noted more behavior problems than in 
any other clinical group. Thirty-eight of the 
100 children presented problems sufficiently 
serious to discuss them with the staff. Seven of 
the 100 required psychiatric treatment. Prob- 
lems commonly mentioned include irritability, 
a tendency to cry easily, to be hypersensitive, to 
be easily fatigued. A follow-up study indicated
improvement after 7 months but a still notice-
able hypersensitivity.

Meyer’s intensive work with 52 children 
[5] contains many references to hyperactivity, 
irritability, disobedience, whining, fatigability 
and crying. Almost all the children showed 
considerable effect and most were reported as 
considerably improved by the end of a year. 
These effects were noted in children with no 
residual paralysis as well as those retaining 
some motor impairment. Meyer states: “This 
study suggests that poliomyelitis, first as a spe- 
cific infection of the central nervous system, 
and second as a crippling disease leading to 
prolonged hospitalization and immobilization, 
tends to produce interference with normal 
mental and emotional development.” 


At least two authorities, Draper [3] and 
Aycock [1], believe that a particular kind of 
personality is more susceptible to polio virus. 
These children presumably show certain phy- 
sical stigmata reflecting endocrine malfunc- 
tion; presumably their emotional behavior 
patterns would reflect this difference also. 

It is not clear from published literature 
whether such personality effects of polio re- 
late to the kind of child who is particularly 
susceptible to the virus, to the disease itself, or 
to the conditions of strain and isolation im- 
posed by the disease. All such views are ad- 
vanced. Papers generally agree that certain 
emotional instabilities are noted, and those re- 
porting early follow-up [2, 5, 8] indicate 
some remission within six months or a year. 
Others note more enduring effects [4, 7]. 


Because many of the cases of the Phillips’ 
study were still available two years following 
the disease outbreak of 1946, a study was 
planned to follow up the clinical and control 
cases of elementary school age and to make 
certain observational studies in the classroom 
and on the playground, hoping to confirm or 
refute some of those so-called “personality 
consequences.” The cases were so widely dis- 
tributed through eighty-odd schools, however, 
that the direct observational follow-up was 
not feasible. In May of 1948, almost two 
years after the epidemic period, 58 pairs of
children were identified in the elementary 
schools, and ratings were obtained from their 
teachers on a number of characteristics. Us- 
able returns on 43 pairs were received. 







The ratings consisted of three instruments: 
The Haggerty-Olson-Wickman Schedule, and 
two shorter schedules prepared by the writer, 
one relating to personality qualities and the 
other to nervous habits.¹ The Haggerty-Olson-
Wickman Behavior Rating Schedule is familiar
to all psychologists working with children. It
is made up of two parts. Schedule A consists of
the rating of the frequency of occurrence for
15 typical behavior problems, including “defi-
ance to discipline,” “marked over-activity,”
“unpopular with children,” and the like. Sched- 
ule B is made up of a series of five-point graphic 
rating scales, divided into four groups. Divi- 
sion 1 relates to intellectual traits, Division 2 
to physical traits, Division 3 to social traits, 
and Division 4 to emotional traits. 

In addition to the Haggerty-Olson-Wick- 
man Schedule, graphic rating scales were pre- 
pared for 26 additional personality character- 
istics presumed from our survey of the litera- 
ture to relate to the possible effects of polio. 
They included hyperactivity, tendency to cry, 
fatigability, withdrawing behavior, irritability, 
self-discipline, etc. The extreme positions of 
each of these graphic rating scales were defined 
by a word or phrase, followed by supporting 
descriptive adjectives. The rater was asked to 
check at either of these positions or at any posi- 
tion intermediate. The teacher’s impression of 
children in general was the reference point in 
all instances. A score was assigned, according 
to an arbitrary division of the graphic scale in- 
to seven equal parts. An additional schedule 
entitled “Nervous Habits Study” included 10 
common tics and nervous habits, each of which 
was rated on a five-point scale—‘“never, sel- 
dom, occasionally, frequently, persistently.” 
The rater was also asked to supply certain ad- 
ditional impressions relating to the presence of 
any physical disabilities, speech defects, and 
comparison in physical size with other children 
in the room. 
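
The scoring of the graphic rating scales can be illustrated with a short sketch. It assumes that a check mark is recorded as a fraction of the line's length (0.0 at one extreme, 1.0 at the other) and that the seven equal parts are numbered 1 through 7 from the first extreme; the text specifies only the division into seven equal parts, so the direction of numbering and the function name are assumptions.

```python
def graphic_scale_score(position, parts=7):
    """Convert a check-mark position on a graphic scale into a 1..parts score.

    position: fraction of the line's length, 0.0 (one extreme) to 1.0 (the other).
    parts:    number of equal segments into which the line is divided.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    # int() truncates, so each equal segment maps to one score;
    # min() keeps a mark at the far extreme inside the last segment.
    return min(parts, int(position * parts) + 1)


# A mark at the midpoint falls in the middle (fourth) of seven segments.
midpoint_score = graphic_scale_score(0.5)
print(midpoint_score)
```

A mark at either defined extreme receives the lowest or highest score, and intermediate marks receive the score of the segment in which they fall.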

These three schedules, the Haggerty-Olson-
Wickman Behavior Rating Schedule, the Per-
sonality Trait Inventory, and the Nervous
Habits Study, so-called, were given at the end
of the school year 1947-48 to teachers who had
had the 58 clinical cases and their controls
during that year. The teachers were asked to
rate the children on the basis of their experience
with them during the previous year. At no
point was the study identified with the previ-
ous study of post-polio cases (which had taken
place during the preceding summer vacation);
the children were merely identified as belong-
ing to a selected sample drawn from all
Minneapolis schools for a nervous habits and
personality study by the rating method.


¹The writer wishes to acknowledge his indebted-
ness to Miss Katherine Nikolaisen, then a teach-
ing assistant in the Institute, for assistance in pre-
paring these schedules.


TABLE 1
MEAN SCORES AND STANDARD DEVIATIONS FOR POLIO AND NON-POLIO CASES,
HAGGERTY-OLSON-WICKMAN SCHEDULES

                      Polio                   Non-Polio
Schedule        N    Mean    S.D.       N    Mean    S.D.    C.R.
A               40   24.8    25.3       39   17.9    23.2    1.43
B (total)       38   71.4    18.6       39   64.5    13.6    2.19
  I             43   16.6     4.3       43   15.6     5.4    1.09
  II            43   13.9     3.5       43   15.6     1.9    3.23
  III           43   21.3     6.7       43   19.7     4.4    1.51
  IV            38   19.3     6.8       39   17.6     5.2    1.44


An analysis of the various schedules yielded
few positive results. Table 1 compares the
mean scores and standard deviations for the
polio and non-polio cases on the Haggerty-Ol-
son-Wickman Schedule. Since these children
were originally matched, allowance must be
made in calculating the critical ratio between
the means. Since the matching was done on in-
telligence and there were no data on the correla-
tion between the samples with respect to Hag-
gerty-Olson-Wickman measures of personality
before the event of polio, a correlation was cal-
culated between the ratings of the pairs of
children for total score on Part B, the most
comprehensive schedule in the series. This
value proved to be +.05 and was insignificant.
Presumably, however, cases matched closely
on intelligence initially might have been
matched on personality as well. Therefore, we
assumed a correlation of +.30 between the
samples with respect to “personality” and cor-
rected our critical ratio values with this value
in the usual manner. This procedure serves
statistically to decrease the likelihood that the
observed differences could have occurred by
chance. The critical ratios reported in Table 1
are corrected by this value. Despite this cor-
rection, the total score of Schedule B, and Part
2 of B are the only ones which can possibly be
interpreted as showing significance. Since total
score of Schedule B is made up in part of Divi-
sion 2, presumably the difference in total
score is accounted for by the difference observed
in Part 2, and by the accumulation of differen-
ces, though nonsignificant, yet favoring the po-
lio cases, in the other three parts. Part 2, it will
be remembered, refers to physical characteris-
tics of the child.
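
The correction of the critical ratio for an assumed correlation between matched samples can be sketched as follows. The sketch assumes the conventional formula for the standard error of a difference between correlated means, with each standard error taken as S.D. divided by the square root of N; the text says only that the correction was made “in the usual manner,” so the exact computation used by the writer is an assumption, and the function name is illustrative. The figures are those printed for Division I in Table 1.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2, r=0.0):
    """Critical ratio for the difference between two means.

    r is the assumed correlation between the matched samples; r = 0.0
    gives the ordinary ratio for independent groups.
    """
    se_1 = sd_1 / math.sqrt(n_1)  # standard error of the first mean
    se_2 = sd_2 / math.sqrt(n_2)  # standard error of the second mean
    # A positive correlation between pairs shrinks the standard error
    # of the difference and so enlarges the critical ratio.
    se_diff = math.sqrt(se_1**2 + se_2**2 - 2 * r * se_1 * se_2)
    return (mean_1 - mean_2) / se_diff


# Division I of Schedule B, values as printed in Table 1.
cr_matched = critical_ratio(16.6, 4.3, 43, 15.6, 5.4, 43, r=0.30)
cr_unmatched = critical_ratio(16.6, 4.3, 43, 15.6, 5.4, 43, r=0.0)

print(round(cr_matched, 2), round(cr_unmatched, 2))
```

Under these assumptions the corrected ratio for Division I comes to about 1.13 against about 0.95 without the correction. The corrected figure is close to, but not identical with, the 1.09 printed in Table 1, so the writer's computation may have differed in some detail such as rounding.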


An item analysis of Part 2 revealed that two 
items differentiate the polio and non-polio cases 
significantly (one per cent criterion), according
to a chi-square treatment of rating distribu-
tion. On item 10 the polio children are less of-
ten rated “stronger than most,” and more of-
ten rated as having some physical difficulties 
than the non-polio. On item 12, the polio cases 
are less often rated as “rarely showing fatigue” 
and more often rated as “not having ordinary 
endurance” than the non-polio cases. Three 
items fell between the 10 and 20 per cent levels 
of probability of differentiating the groups, ac- 
cording to the chi-square comparison of rating 
distribution. There is a possibility that the po- 
lio cases were rated somewhat less frequently 
as making a favorable impression with their 
physique and bearing and more often rated as 
generally unnoticed. The polio cases were pos- 
sibly somewhat less likely to be rated as ener- 
getic, vivacious, or moving with required speed 
and a little more likely than the non-polio 
cases to be rated as slow in action, or overac- 
tive. Also, there is the possibility that the polio 
cases were rated a little less often as willing to 
take reasonable chances and a little more often 
as getting “cold feet” in situations requiring 
nerve or courage. Two items were not at all 
significant—being slovenly or neat in personal 
appearance, and masculinity or femininity of 
physical characteristics. 

TABLE 2
MEAN SCORES, HAGGERTY-OLSON-WICKMAN SCHEDULES, ACCORDING TO DIAGNOSTIC
CATEGORIES OF POLIO

Category         N     A      B (Total)   I      II     III    IV
[illegible]      10    17.6   72.4        13.7   13.2   23.5   19.1
[illegible]      18    31.7   75.9        17.5   14.9   21.8   20.9
[illegible]       6    21.2   63.2        16.8   13.3   18.7   17.0
[illegible]       9    19.8   63.4        15.9   12.2   19.1   17.1


TABLE 3
MEAN HAGGERTY-OLSON-WICKMAN SCORES FOR THOSE WITH GREATER THAN MEDIAN
DAYS OF HOSPITALIZATION (23), AND THOSE WITH LESS THAN
MEDIAN DAYS OF HOSPITALIZATION

                      N     A      B (Total)   I      II     III    IV
Cases above median    22    23.7   70.9        16.1   13.5   21.4   19.8
Cases below median    19    25.2   71.1        16.9   14.0   21.1   19.0


The two schedules composed of personality
traits and nervous habits were also item analyzed
with negligible results. On one item
only was a significant difference noted—the 
polio cases being rated as more unself-disci- 
plined, yielding to whims and erratic, than the 
non-polio cases. There is a possibility (at the 
5 per cent level) that the polio children were 
somewhat less frequently rated as “enjoying 
life, being unfettered by self or circumstances, 
being open, free and unbounded” and a little 
more often rated as being “constricted in per- 
sonality due to limitations of self or circum- 
stances—not enjoying life to the full.” 

On the nervous habits schedule, one item 
fell between the two and five per cent level of 
probability. This item refers to manipulation 
of genitals, the non-polio cases exceeding the 
polio in this instance. No other item came as 
high as even the 20 per cent level of probability. 

A question was asked regarding the pres- 
ence of physical deficiency. Here the chi- 
square between the polio and non-polio cases 
was significant, some 15 of the polio cases be- 
ing rated as having physical defects, and only 
four of the non-polio cases being so rated. An
examination of the 15 questionnaires marked
by the teachers as relating to post-polio chil-
dren with noticeable defects revealed that in a
number of these the teacher knew of the polio
and reported the fact that “the child had to be
more careful,” but that physical effects of the
polio were not apparent to the casual observer. 
There was no significant difference in the 
groups with respect to physical size, speech de-
fects, or whether the nervous habits, if any,
were more likely to occur morning or after-
noon. 
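
The chi-square comparison for the physical-deficiency question can be illustrated with a fourfold table. The counts of 15 polio and 4 non-polio children rated as having defects are from the text; the group sizes of 43 each are an assumption based on the number of usable pairs, since the text does not give the number of teachers answering this question, and the formula below applies no continuity correction, which may differ from the writer's procedure.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for a fourfold table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


GROUP_SIZE = 43  # assumed: one rating per child in each of the 43 usable pairs

polio_defect, non_polio_defect = 15, 4  # counts reported in the text
chi2 = chi_square_2x2(
    polio_defect, GROUP_SIZE - polio_defect,
    non_polio_defect, GROUP_SIZE - non_polio_defect,
)

CRITICAL_1_PER_CENT = 6.635  # chi-square, 1 degree of freedom
print(round(chi2, 2), chi2 > CRITICAL_1_PER_CENT)
```

On these assumptions the value is roughly 8.2, which exceeds the 1 per cent critical value for one degree of freedom and is consistent with the statement that the difference was significant.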

In the original records, data were available 
relating to the diagnosis of the polio condition, 
the number of days of hospitalization, the
number of days of fever, and the maximum 
temperature recorded. The polio cases were 
reexamined with respect to mean score on the 
Haggerty-Olson-Wickman Schedule according 
to the diagnosis of polio. Table 2 reports mean 
scores for the Haggerty-Olson-Wickman 
Schedules according to diagnosis. In this table 
the largest numerical differences, those of 
Schedule A, were tested for significance. None 
of these differences satisfied a 5 per cent proba- 
bility criterion. A distribution of the number 
of days of hospitalization was made and the 
cases divided into two groups, those above the 
median number of days, and those below the 
median. Mean scores on the various rating 
schedules did not differ appreciably for these 
two groups and were not tested for significance. 
Table 3 reports these mean scores. A similar 
result was found with number of days of fever. 
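
The division of cases at the median number of hospital days can be sketched as below. The records shown are hypothetical placeholders, not data from the study, and the handling of cases falling exactly at the median (here, leaving them out of both groups) is an assumption; Table 3 lists 22 and 19 cases, which suggests that not all 43 cases entered the comparison, but the text does not say why.

```python
from statistics import mean, median


def median_split(cases):
    """Split (days_hospitalized, score) records into above- and below-median groups.

    Cases falling exactly at the median are left out of both groups
    (an assumption; the source does not state how ties were handled).
    """
    cut = median(days for days, _ in cases)
    above = [score for days, score in cases if days > cut]
    below = [score for days, score in cases if days < cut]
    return cut, above, below


# Hypothetical records: (days of hospitalization, Schedule B total score).
example_cases = [(10, 70), (15, 65), (23, 72), (30, 80), (41, 60)]

cut, above, below = median_split(example_cases)
print(cut, mean(above), mean(below))
```

The two group means can then be set side by side, as in Table 3, before deciding whether a formal test of significance is worth making.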

The maximum temperature reached like- 
wise appears to be unrelated to score on the 
Haggerty-Olson-Wickman Schedule. Table 4 
gives the results. While some of the differences 
noted therein might be “statistically signifi- 
cant,” they follow no logical pattern and must 
be discounted because of the very small num- 
ber of cases in the various categories.


TABLE 4
MEAN HAGGERTY-OLSON-WICKMAN SCORES FOR CHILD POLIO PATIENTS DIVIDED
ACCORDING TO HIGHEST TEMPERATURE RECORDED

Highest temperature   N     A      B (Total)   I      II     III    IV
104°                   6    31.4   74.2        17.7   15.2   20.3   21.0
103°                  13    19.5   66.9        14.6   13.7   20.7   17.8
102°                  11    35.7   60.0        22.1   14.0   23.2   22.2
[illegible]            6    13.2   64.4        14.5   12.7   20.0   16.0
[illegible]            7    19.4   68.7        32.8   13.4   20.6   18.3


The net result of this study is to encourage
the conclusion that polio apparently has little
direct lasting effect on the children’s personality
or behavior. Apart from simple strength and
endurance factors, only one statistically signifi- 
cant personality difference was located. This 
related to self-discipline and may very well re- 
late to indulgence of a presumably weakened 
child. 

It is possible that personality effects may 
follow polio, and relate to the extent or severi- 
ty of crippling conditions which alter the 
child’s social situation. Because few of the chil- 
dren in this sample showed any more than a 
very mild crippling condition, this aspect of 
the personality effects of polio could not be 
considered. 

Inasmuch as the teachers who made these 
ratings were unaware of the nature of the study 
and hence presumably gave reasonably object- 
ive ratings, one can find satisfaction in the ex- 
tremely negative results of the study. Consider- 
ing the mothers’ complaints which led to this 
particular inquiry, one may say that if those 
complaints were valid and not just observations 
of fond and apprehensive mothers, the bases 
for their judgments were no longer particular- 
ly apparent in the child’s school behavior after 
a lapse of two years. No support is found in 
these results for the hypotheses advanced in 
other articles that (a) polio may strike preferentially
certain “immature” types, or that (b)
polio accentuates certain personality weaknes- 
ses already implicit in the child’s personality. 
Received November 14, 1949.


FACTORS RELATED TO VOLUNTARY DISCONTIN-
UANCE OF CONTACT DURING COUNSELING


BARBARA A. KIRK AND ROBERT R. HEADLEY


COUNSELING CENTER, UNIVERSITY OF CALIFORNIA, BERKELEY 


MOST studies evaluating counseling
have emphasized success. Two methods
have chiefly been used. One applies
an outside criterion to the success of
counseling, either objective, such as grades, or 
subjective, as determined by ratings. The se- 
cond method consists of a follow-up of the re- 
action of counselees to the counseling process, 
usually by questionnaire. In the latter method 
only those who reply to such a questionnaire 
are ordinarily included in the study. 

An attempt to analyze failures rather than 
successes offers a different approach to coun- 
seling evaluation. The present study, while 
analyzing failures, represents an indirect ap- 
proach in that it is not concerned with the out- 
come of counseling, but rather with the coun- 
selee’s voluntary discontinuance before comple- 
tion, or arriving at a decision. It is recognized 
that failure to complete counseling does not 
necessarily impugn the quality of the counsel- 
ing, nor does continuance necessarily guarantee 
successful counseling. Yet continuance or dis- 
continuance gives indication of the feelings of 
the counselee regarding the value of the ser- 
vice he has sought. 

The method of analyzing discontinuance of 
counseling might also provide valuable infor- 
mation for professional workers for the im- 
provement of their techniques in counseling 
and the improvement of procedures in the op- 
eration of the counseling process. 

A survey of the psychological literature from 
1926 to the present reveals that an analysis of 
the failures and losses during counseling has 
not been undertaken and that this method of 
assessing the counseling process is yet to be ex- 
plored. 

In order to define discontinuance during 
counseling for the present study, a description 
of the pertinent procedures at the University 
of California, Berkeley, Counseling Center 


must be presented. The original contact with 
the Center is understood to be of immeasurable 
importance to the counselee, and every effort 
was made by the trained preliminary inter- 
viewer to understand his need, together with 
any ambivalence felt towards the counseling 
process. 


It was the Center’s policy not to transfer cases 
after original assignment, but rather to attempt in 
every way throughout the process to integrate the 
relationship between counselor and counselee. 

It is recognized that the number of interviews 
required for the solution of the problem presented 
varies with the individual, and it is the practice 
in this Center to continue counseling for as many 
interviews as may be desirable. 

Various methods of maintaining contact were 
given careful consideration and were constantly 
being evolved and improved. Every effort was made 
by the Center to indicate the Center’s interest in 
the counselee without placing undue pressure upon 
him for continuance if he did not at the time feel 
a desire for it. 


During the year the following method of 
maintaining contact was developed: After the 
counselor had taken the counselee to the testing 
division, this division took responsibility. Ap- 
proximately one week after a failed appoint- 
ment, a form post card was sent asking the 
counselee to contact the Center if he wished to 
complete his testing and counseling. If he was 
not heard from within another week a dupli- 
cate post card was sent. If no response was re- 
ceived from these two post cards, the counselor 
then decided whether any further contacts 
should be made, and if so, of what kind. He 
might call the counselee directly, leaving a 
message for him to call the counselor, or he 
might write a personal letter. The other alter- 
native was the more impersonal follow-up 
contact by the preliminary interviewer, thus 
giving the counselee greater leeway about re- 
sponding. 
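
The sequence of follow-up contacts described above can be summarized in a brief sketch. The one-week intervals and the order of steps are taken from the text; the function name, the argument names, and the wording of the returned actions are hypothetical, and the day counts treat “approximately one week” as exactly seven days.

```python
def next_follow_up_step(days_since_failed_appointment, cards_sent, responded):
    """Return the next follow-up action for a counselee who missed an appointment."""
    if responded:
        return "resume testing and counseling"
    if cards_sent == 0:
        # First form post card goes out about one week after the failed appointment.
        if days_since_failed_appointment >= 7:
            return "send form post card"
        return "wait"
    if cards_sent == 1:
        # Duplicate card follows if nothing is heard within another week.
        if days_since_failed_appointment >= 14:
            return "send duplicate post card"
        return "wait"
    # After two unanswered cards the counselor chooses among a telephone call,
    # a personal letter, or follow-up by the preliminary interviewer, or none.
    return "counselor decides on further contact"


print(next_follow_up_step(8, 0, False))
```

The final step is left as a decision rather than an action because the text gives the counselor discretion over whether any further contact should be made at all.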
PROCEDURES


The year 1947 was selected for this study 
because it was the first full calendar year in 
which the Center (established October 1946) 
was in operation as a Veterans Administration 
Guidance Center, and also because a full year 
had intervened before the study. During 1947 
only veterans were counseled at this Center, 
including 2,081 University of California stu- 
dents and 286 nonstudents. Most of the coun- 
selors were still relatively inexperienced. There 
were fourteen different counselors during this 
year, including graduate student trainees, with 
one supervisor for all. 


In order to insure objectivity, a competent 
clerk was assigned the project of examining 
the full case records for all of the 2,367 cases 
for whom counseling had begun. She listed 
each case in which there was any suggestion 
that a final solution had not been reached, or 
in which the records were technically incom- 
plete, indicating the stage at which counseling 
contact was lost by the Center. 

Although only 8 of the 14 counselors re- 
mained in the Center, all were contacted and 
agreed to try to evaluate the reason for loss in 
each case. 

“Loss” was defined as the counselee’s failure 
to continue, after the counseling process had 
once begun, at any time during the counseling 
process (either before or after an appointment 
had been made) without arriving at some de- 
cision with the counselor. This decision might 
be on a plan, or the postponement of planning 
or counseling, or not to proceed with coun- 
seling. 

Work with the data was not resumed until 
the spring of 1949. At that time the authors 
jointly reviewed every individual case included 
in this study, added their own comments, im- 
pressions, and evaluations, and attempted to 
develop classifications under which the various 
diagnoses of the counselors could be subsumed. 


RESULTS 
Of the entire 2,367 cases begun in 1947, 140
had been considered “incomplete” prior to
counselor review. This was 5.9 per cent of the
total intake.

Of 320 Public Law 16’s accepted during
1947, 27 or 8.4 per cent discontinued as com-
pared with 113 or 5.5 per cent of 2,047 Public
Law 346’s.


As a result of the counselors’ evaluation the 
number of discontinuances was reduced by 
thirty. The reasons for the elimination from 
this study, with which the authors concurred 
in all cases, fell into five groups, listed in order 
of frequency of occurrence: 


1. Decision reached regarding training or em- 
ployment. 


2. Decision reached not to enter training under Public Law 16
or training desired under Public Law 16 not available.


3. Left the area during counseling and so in- 
formed the counselor. 


4. Decision reached to postpone counseling. 


5. Referral to Center was a Veterans Adminis- 
tration technical error. 


The remaining 110 cases, constituting the 
main group studied, are 4.6 per cent of the to- 
tal intake for that year. Of this group 16 were 
Public Law 16 referrals, or 5 per cent of the 
total Public Law 16 referrals, and 94 were 
Public Law 346 cases, or 4.6 per cent of the 
total Public Law 346’s. Only four cases were 
females. Only 87 or 79 per cent were students, 
compared with 92 per cent of the total group 
for the year. 
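
The percentages quoted so far follow directly from the reported counts. As a minimal arithmetic sketch (the helper name `pct` and the dictionary labels are illustrative assumptions, not part of the study), the figures can be recomputed as follows:

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

TOTAL_INTAKE = 2367   # all veteran cases begun in 1947
PL16_INTAKE = 320     # Public Law 16 cases accepted
PL346_INTAKE = 2047   # Public Law 346 cases

rates = {
    "incomplete before review": pct(140, TOTAL_INTAKE),
    "P.L. 16 incomplete": pct(27, PL16_INTAKE),
    "P.L. 346 incomplete": pct(113, PL346_INTAKE),
    "final group": pct(110, TOTAL_INTAKE),
    "P.L. 16 in final group": pct(16, PL16_INTAKE),
    "P.L. 346 in final group": pct(94, PL346_INTAKE),
    "students in final group": pct(87, 110),
}

for label, value in rates.items():
    print(f"{label}: {value} per cent")
```

Each computed value agrees with the figure in the text once rounded to the precision the authors used.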

The stage in the counseling process of discontinuance for the
final group is seen in Table 1.


TABLE 1
DISCONTINUANCE OF FINAL GROUP BY STAGE OF COUNSELING

                                          Total Group      P.L. 16's
Stage in the process                        N      %        N      %
Number dropped after first interview       16    14.5       4    25
Number dropped during testing              35    31.8       4    25
Number dropped before second interview      9     8.2       2    12.5
Number dropped after second interview      38    34.5       2    12.5
Number dropped after subsequent
  interviews before completion             12    11         4    25
Total                                     110   100%       16   100%

Approximately one-half dropped out before the second
interview, and one-half completed at least the second
interview. It is of interest that of the 50 cases who
discontinued after the second interview, 13 were counselees of
one counselor alone. This counselor had been recognized as the
least adequate of the professional staff members and his
services were discontinued during this year. Whereas this
finding may suggest an objective method of evaluating a
counselor’s effectiveness, the general distribution of cases
discontinuing in this study does not warrant positive
affirmation. In most cases, the authors’ impression was that
the counselors’ inexperience in the mechanical aspects of
records handling, rather than professional competence or
incompetence, resulted in the designation of “discontinuance.”
Detailed analysis of discontinuances by counselor proved
unfeasible because of wide variation in length of service,
number of cases handled, and kind of case load.


TABLE 2
PRIMARY REASONS FOR DISCONTINUANCE OF COUNSELING

Categories                                               P.L.16  P.L.346  Total     %
 1. Counselee’s unwillingness or inability to face
    self-evaluation or take responsibility for taking
    steps towards a solution. (Fear of test results,
    facing personal or emotional problems, unrealistic
    regarding objective.)                                    2      21      23    20.9
 2. Planning progressed to satisfactory solution,
    resulting in employment, training, or
    psychotherapy. (Only techni[illegible])                  2      18      20    18.2
 3. Public Law 16 application suspended. (Unsure of
    benefits for which making application. Did not
    wish to pursue training, really wanted employment,
    or on-the-job training for which opportunities
    not available.)                                         12       0      12    10.9
 4. Interruption by circumstances beyond control.
    (As: On vacation, certification to another Center,
    counselor’s illness, administrative refusal of
    academic transfer or re-entry.)                                 11      11    10.0
 5. Left University (and area, for employment or
    another in[illegible])                                           9       9     8.2
 6. Purpose of counseling not clear to counselee.
    (i.e. felt appropriate to defer counseling until
    ready to plan schooling, [illegible])                            7       7     6.4
 7. Counselor handling inadequate. (Counselor’s
    feelings in[illegible])                                          6       6     5.5
 8. Veterans Administration handling, or Veterans
    Administration technicality.                                     4       4     3.6
 9. Delay in obtaining appointment necessitated
    solution outside of counseling.                                  2       2     1.8
10. No information or insufficient information.                     16      16    14.5
    Total                                                                  110   100.0%

Tabulation of discontinuances by month was 
made because of the authors’ impression that 
there would be some relationship between dis- 
continuation and the end of academic periods 
as in February, June, and Summer Sessions. 
There appeared in these tabulations to be no
marked association. There was some decrease
during the latter months of the year, which 
could well be attributed to increased experi- 
ence on the part of the remaining counselors. 


In analyzing the reasons for discontinuance 
with this main group of 110 cases, it was de- 
cided to determine whether they would natur- 
ally fall into classifications. Categories were es- 
tablished on reviewing the counselors’ desig- 
nation of the primary reasons for discontinu- 
ance. The authors then considered all informa- 
tion included in the case record as well as the 
counselors’ evaluation, and tabulated by most 
appropriate category when any doubt existed. 
Factual information was always preferred to 
counselors’ speculation. The authors found 
themselves in agreement in assignment of cate- 
gories. The dispositions according to 10 cate- 
gories are seen in Table 2. 

Categories 6, 7, and 9, totaling 15 cases, 
clearly attribute the responsibility for the 
counselee’s discontinuance to the counseling
process or inadequate counseling techniques. In
category 1, the responsibility was placed by the 
counselor on the counselee, but conceivably 
ideal handling might have prevented discontin- 
uance in some or all of these 23 cases. 

It would appear that in categories 2, 3, 4, and 5, totaling 52
cases, counseling had progressed satisfactorily as far as the
counselee desired, or as his circumstances permitted. Thus in
no more than 58 cases at the maximum, including those 16 for
which there was no information, can it be conjectured that
discontinuance resulted from the counseling process itself.
This would constitute at the outside 2.8 per cent of all cases
counseled that year.
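
The grouping of Table 2 categories in the two preceding paragraphs can be retraced arithmetically. The sketch below assumes the category counts of Table 2, taking the count for category 7 as 6; that value is an inference from the stated total of 15 for categories 6, 7, and 9, since the table row itself is not legible. Computed directly, 58 of 2,367 comes to roughly 2.5 per cent, somewhat under the 2.8 per cent given in the text; the reason for that difference cannot be recovered from the report.

```python
# Primary-reason counts by Table 2 category (final group of 110).
# Category 7 is inferred: categories 6, 7, and 9 are stated to total 15.
counts = {1: 23, 2: 20, 3: 12, 4: 11, 5: 9, 6: 7, 7: 6, 8: 4, 9: 2, 10: 16}
TOTAL_INTAKE = 2367  # all cases counseled in 1947

# Categories attributed to the counseling process or technique.
process_related = sum(counts[c] for c in (6, 7, 9))

# Categories in which counseling progressed as far as desired or permitted.
satisfactory = sum(counts[c] for c in (2, 3, 4, 5))

# Everything else is the upper bound on process-caused discontinuance.
remaining = sum(counts.values()) - satisfactory
share = round(100 * remaining / TOTAL_INTAKE, 2)

print(process_related, satisfactory, remaining, share)
```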

The authors found it extremely difficult to determine from
these records whether the problem lay chiefly within the
counselee or with the handling in the Center. In the analysis
of the raw data there seemed to be a definite overlapping
between categories 1 and 7, that is, the unwillingness or
inability to face self-evaluation or take steps towards
self-evaluation on the part of the counselee, and inadequate
handling on the part of the counselor. Classification was
relatively easy, based on the full statements of the
counselors, but in the authors’ opinion the counselor in some
cases took too much responsibility and in some cases too
little. It is also impossible to know whether in a particular
case in which a counselee exhibited hostility, anxiety,
aggression, fear, lack of responsibility, etc., a different
counselor, or the same counselor at a different time, would
have handled the situation in a manner sufficiently different
to establish the kind of relationship which would have caused
the counselee to continue. Some suggestions in this area are
derived from an analysis of those cases who returned after
discontinuing, to be discussed later.


When reviewing the records to establish the 
above categories, the authors were impressed 
by the recurrence of certain facts and com- 
ments. These observations suggested a number 
of over-all tabulations. For this purpose, 16 
cases with no information or insufficient in- 
formation available were necessarily excluded. 
These tabulations are therefore based on the 
remaining 94 cases, and are in all probability 
minimal. There may have been many omissions of designation or
mention of factors tabulated. These results therefore can be
considered as, at the most, suggestive.

A factor which appears to have considerable bearing on
discontinuance of contact is the nature of the referral. It
was found that of the 94 cases, 41 or 43.6 per cent had been
referred for counseling. By comparison only 28 per cent of the
total intake for the year were referred. Moreover, of these 41
referrals, 39 can be characterized as “authoritative”
referrals either by the Veterans Administration or by
University authorities, being “authoritative” in the sense
that an administrative decision was dependent upon the report
of counseling.


Twelve of the 94 cases received a neuropsy- 
chiatric discharge from military service, or at 
the time of counseling required and were re- 
ceiving psychotherapy. 

In 41 cases of the 94, the counselor’s notes specifically
mention personal or family problems, or both, as pressing at
the time of counseling. Not all of the 12 cases of
neuropsychiatric discharge or active psychotherapy are in this
group, because in some of them previous or present problems
were well enough in hand that they were not evidenced in the
early counseling interviews. Whether the significance for
discontinuance of counseling resulted from the pressures of
personal or family problems, or from discussing them in
counseling, or from discussing them too early in counseling,
is not determinable from the present data; nor do we have an
objective comparison available with the number of counselees
who made early mention of such pressing problems and continued
in counseling. It is, however, the authors’ impression that in
the group who discontinued, there is an undue proportion of
early discussion of severely stressful family and emotional
problems.

In 7 of the 94 cases it was specifically men- 
tioned that in getting in touch with the veter- 
an, contact had been made with some member 
of the immediate family, usually mother or 
wife, or an identifiable message had been left. 
Each of these 7 had voluntarily requested 
counseling. It was the counselor’s belief that 
the family’s knowledge of the counselee’s con- 
nection with the Center occasioned family 
pressure, which, because it may have been 
closely related to the counselee’s problem, 
caused him to react negatively to continuance
with counseling. This finding appears to have
implication for procedure in maintaining con- 
tact with counselees. It would seem that with 
late adolescent and adult counselees who have 
themselves initiated a request for service, it is 
an abrogation of their confidence even to in- 
form members of their household that they 
have done so. 


In 8 cases, the counselor attributed attitudes of “hostility,”
“aggressiveness,” “noncooperation,” “resistance” to the
counselee as clearly apparent in the first interview.

In 15 cases, the counselor indicated his own 
inadequacy in some respect in handling the 
counselee. Mentioned were such matters as the 
involvement of counselor’s feelings, counselor’s 
insecurity, poor handling, etc. Only 3 or 4 of 
the 14 counselors made such statements. 

Voluntary Return After Discontinuance. It 
was found that in some cases those who discon- 
tinued counseling and had been considered 
“lost,” later returned voluntarily. Seventeen 
or 15.5 per cent of the group of 110 returned 
prior to the present tabulations. 

A tabulation was made of the number of 
months elapsing in each of these cases between 
their discontinuance and their voluntary re- 
turn. Table 3 indicates the number of cases re- 
turning, according to interval of months, be- 
tween time of discontinuance and date of vol- 
untary request for a subsequent appointment. 


TABLE 3
RETURNEES AFTER DISCONTINUANCE BY LENGTH OF
INTERVAL (FINAL GROUP OF 110)

Interval            Interval
(months)     N      (months)     N
    1        1         13        1
    5        2         14        1
    6        2         17        3
    7        1         18        1
   10        1         21        1
   11        3
Total                           17


It is interesting that approximately one-third returned
more than 16 months after discontinuance.

Since this tabulation is being made only 16 
months after the final date included in the 
study, it is apparent that all information is not 
in, that others in this study may return in the 
future, and that therefore these are only par- 
tial figures. 

The subsequent interviews gave some clarification regarding
reasons as seen by the counselees for their previous
discontinuance. A few
general statements can be made. Many of these 
counselees had remained in their current train- 
ing or employment status, and had made a de- 
cision against changing, or a decision to post- 
pone a change. According to a number of their 
statements, they had not realized that they had 
discontinued, and they felt that they had made 
a decision at the time and then returned when a 
particular need arose. Three cases returned 
for counseling to reactivate their application 
for benefits under Public Law 16, when their 
employment, which they really preferred to 
training, terminated. It was felt that in these 
cases if another opportunity for employment 
arose they would again accept it, and probably 
again without notifying the counselor. 

Unfortunately, it was not possible to relate 
the statements of the counselees in these sub- 
sequent interviews to the counselor’s opinion of 
the reason for the original dropout, because in 
most cases the counselor had had access to this 
material and included it in the formulation of 
his opinion. 

DISCUSSION 


This study was fraught with all the diffi- 
culties attendant upon analysis of data not spe- 
cifically and directly gathered or designed for 
this purpose. Such difficulties, however, point 
the way towards the planning of research in 
which some of the liabilities in the present 
study would be minimized. 

The chief problem, of course, was with the 
records. It is the practice of this Center to keep 
running notes for each interview, as well as ex- 
tensive and detailed summaries of case history 
material, progress of counseling, and planning. 
Summaries of cases, however, will not compare 
with process recording or wire recording for 
obtaining knowledge of what actually hap- 
pened in an interview. Unless there is relative- 
ly full knowledge both of interview content 
and relationship interaction in an interview, it 
is not possible even to hazard a guess as to 
what may have caused the termination or in- 
terruption of the relationship. 

An area in which the records were particularly deficient was
that relating expression of personal and emotional problems to
techniques used in counseling. There was enough mention of
such problems to indicate that recording should be
particularly complete in this area.

It was never known whether the counselee’s 
emotional problems, as evidenced in his first 
contact with the Center, were so great that no 
other handling would have induced him to re- 
main with the counselor, or would have en- 
abled him to make more genuine progress. 

Similarly, there was lacking in the records 
that kind of information which could assist in 
determining why in one case circumstances re- 
sulted in discontinuance, but similar circum- 
stances and attitudes in other cases did not nec- 
essarily result in the same behavior. Such de- 
termination would require the utmost detail in 
the recording of subtle attitudes, reactions, and 
interactions both of counselee and counselor. 


Some counselors tended to give many reasons or many suggested
possibilities; others very few. Some counselors tended to
state their analysis in purely psychological terms; some in
purely situational; some in a combination. Better direction of
counselor thinking, rather than a free situation for analysis,
might at least secure responses in more standard and universal
terms.

It was never known whether the reasons 
expressed by the counselor were projected, nor 
to what extent they might be. It was especially 
never known, as has been stated previously, 
whether some counselors accepted too much or 
too little responsibility, nor in what proportion 
the responsibility should have been placed with 
the counselee, the situation, or the counselor. It 
is imperative for future research to consider 
and deal with this problem. 


The criterion used for discontinuance may 
have been a totally or largely artificial one, and 
better and more valid criteria need to be 
evolved in future studies. If the criterion for 
discontinuance is leaving counseling before a 
solution is reached, it is necessary then to de- 
termine what constitutes a solution of the pre- 
senting problem. There is also the question as 
to what constitutes maintenance of counselor- 
counselee relationship. 


We may also question the establishment of 
criteria for categorization with the weight be- 
ing on the “making” of a decision without re- 
gard as to whether the decision made was an 
advisable or inadvisable one from the stand- 
point of the welfare of the counselee. Serious 
consideration should be given this matter in
future studies.

Regardless of completeness of records, there 
remains difficulty in objectifying what is essen- 
tially subjective material. In the case of the 
present study, the authors have wondered at 
times about the justification for fitting every 
case into an objective category. It should also 
be pointed out that their impressions, derived 
from review of the records, were not subjected 
to objective validation. 

It would appear desirable to obtain inde- 
pendent evaluations from counselor and coun- 
selee regarding the cause or causes of discon- 
tinuance. Obtaining them, however, poses still 
more complex problems of methodology, many 
of which are obvious. How to obtain such in- 
formation reliably, and by whom, are very dif- 
ficult questions. However, it is important to 
recognize whose point of view is employed in 
depending upon subjective data, whether it be 
that of the counselor, counselee, or interviewer. 

It is hoped that future studies will progress 
towards distinguishing which of the factors 
related to discontinuance are implicit in the 
population, which stem from the nature of the 
referral, and which result from the counseling 
itself. 

SUMMARY 

The present study was conducted to deter- 
mine the extent to which counselees voluntarily 
discontinued contact during the counseling 
process, and the factors related to such discon- 
tinuance, as a method of evaluating counseling. 


A review of records of the 2,367 veteran 
counselees of the University of California, 
Berkeley, Counseling Center for the calendar 
year 1947 listed 140 or 5.9 per cent of the total 
as “incomplete.” After the counselors elimin- 
ated those which had not discontinued, 110 or 
4.6 per cent remained. On careful analysis of 
these cases, it was found that counseling had 
progressed satisfactorily as far as the counselee 
desired, or his circumstances permitted, in all 
but 58 cases, or 2.8 per cent of all cases coun- 
seled that year, a surprisingly small figure. 


Of the 110 cases originally designated as “incomplete,” 17 or
15.5 per cent returned prior to the present tabulation;¹
approximately one-third returned more than 16 months after
discontinuance.

¹It is interesting to note that since the preparation of the
manuscript, eleven more of those in the final dropout study
have returned, bringing the total to 28 out of 110. For these
eleven, the intervals between interruption of counseling and
resumption distributed as follows: 15 months, 1; 23 months, 3;
26 months, 1; 28 months, 1; 29 months, 1; 31 months, 1;
32 months, 1; 34 months, 1; 38 months, 1; total, 11.


Factors considered in relationship to discontinuance were:

1. 8.4 per cent of Public Law 16 cases discontinued as
compared with 5.5 per cent of Public Law 346 cases.

2. Approximately one-half dropped out before the second
interview and one-half completed at least the second
interview.

3. Although students comprised 92 per cent of the total group
counseled for the year, they were only 79 per cent of the
dropouts.

4. There appears to be no marked association between
discontinuance and academic periods.

5. It appeared that counselor inexperience in the mechanical
aspects of records handling, rather than professional
competence or incompetence, resulted in the technical
designation of “discontinuance.”

6. Primary reasons for discontinuance, as stated by the
counselors, and categorized by the reviewers, were, in this
order:

a. Counselee’s unwillingness or inability to face
self-evaluation or take responsibility for taking steps toward
a solution.

b. Planning had progressed to a satisfactory solution,
resulting in employment, training, or psychotherapy.

c. Public Law 16 application was suspen- 
ded. 

d. Counseling was interrupted by circum- 
stances beyond control. 

e. The counselee left the University. 

f. The purpose of counseling was not clear 
to the counselee. 

g. The counselor handling was inadequate. 

h. The Veterans Administration handling, 
or Veterans Administration technicality, was 
at fault. 

i. Delay in obtaining an appointment neces- 
sitated solution outside of counseling. 

7. Subsequent interviewing of returnees in- 
dicated their feeling in general that a decision 
had been reached and that they had not in fact 
discontinued counseling. 

8. 43.6 per cent of the dropouts were referred for counseling
as compared with only 28 per cent of the total counseled
group. All but 2 of the referrals can be characterized as
“authoritative.”

9. There appears to be some relationship 
with the following conditions: existence of a 
discharge for neuropsychiatric reasons; pres- 
sing personal problems at the time of counsel- 
ing; counselor contact with members of the 
family during follow-up of the counselee, thus 
informing them that the counselee had come 
for counseling; clearly evident counselee resis- 
tance during the first interview. 


Received November 30, 1949.


A SIX MONTH FOLLOW-UP OF THE EFFECTS OF PERSONAL
ADJUSTMENT COUNSELING OF VETERANS¹


MARION R. BARTLETT 
VETERANS ADMINISTRATION 
WASHINGTON, D. C. 


THE primary difficulty in making an evaluation of the results
of psychological counseling is that of establishing reliable
criteria against which to measure the effectiveness of the
method used. The study here
reported attempts to make use of the method 
of the observable changes in typical behavior 
in everyday situations, as reported by impartial 
observers. Training Officials responsible for 
supervising the adjustment of veterans during 
the course of training were asked to rate those 
referred for brief therapeutic counseling in 
terms of degree of improvement after follow- 
ing the case for at least six months subsequent 
to counseling. 

Subjects. Veterans in rehabilitation training who were
referred to the “Personal Adjustment Counselor” by Training
Officials because of difficulties judged to be due to
emotional factors, were the subjects in this experiment.
Typical of such personality difficulties, as reported
nontechnically by the Training Officials, were such things as:
excessive worry, irritability, quarrelsomeness, indecision,
lack of concentration on study, alcoholism on the job,
overaggressiveness or insolence in the training situation,
progress much below ability level, absenteeism, and a vast
gamut of related difficulties. Of all the cases so referred,
11 per cent were further referred for treatment under the team
set-up of the mental hygiene clinic, which is available for
veterans of this type because they show severe personality
disturbance, but these cases are omitted from our study
because we are here interested in the effectiveness of
personal adjustment counseling alone as the remedial factor.
Our subjects therefore comprise veterans whose difficulty with
adjustment in regard to their training situation indicated
minor personality disturbances.

¹Acknowledgment is made of indebtedness for the contribution
of Dr. Lorenz Meyer and Miss Mildred Swearngin in the
preparation of the data for this study.

Method. Training Officials responsible for 
supervising veteran trainees before, directly af- 
ter, and six months or more subsequent to 
counseling, were asked to evaluate the veterans’ 
progress in terms of “Much,” “Some,” and 
“No” improvement. An unselected sample of 
498 cases of the type just delineated were thus 
rated six months or more after counseling. 

A total of 38 psychologists, all of whom had had in-service
training in nondirective techniques, in addition to a minimum
of three years of other clinical experience, carried on the
therapeutic counseling relying largely but not wholly upon the
nondirective techniques.

Results. As rated by Training Officials observing monthly
progress in the training situation, the following results were
found: Of the total 498 cases, 191, or 38 per cent, were rated
“Much Improved”; 216, or 44 per cent, were rated as having
made “Some Improvement”; and 91, or 18 per cent, were rated as
having made “No Improvement” subsequent to counseling. Pooling
categories I and II, namely those rated as showing “Much” and
“Some” improvement, we have on the asset side 82 per cent of
the cases showing improvement, and on the liability side 18
per cent who were rated as having made “no” improvement.
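
The rating distribution can be recomputed from the counts reported above. This is only an arithmetic sketch with illustrative variable names. Note that 216 of 498 comes to about 43.4 per cent, so the printed 44 per cent may reflect rounding chosen so that the three figures sum to 100.

```python
# Counts of veterans in each rating category, as reported in the text.
ratings = {"much": 191, "some": 216, "no": 91}
total = sum(ratings.values())

# Percentage share of each category, to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in ratings.items()}

# Pooled share rated as showing "much" or "some" improvement.
improved = round(100 * (ratings["much"] + ratings["some"]) / total, 1)

print(total, shares, improved)
```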

A further treatment of the data was made by analyzing them in
such a way as to relate the number of interviews to the rated
degree of improvement. The average number of interviews with
veterans rated in the “much” improvement category was 5.2; in
the “some” improvement category, 4.0; and in the “no”
improvement category, 2.0. This observed positive relationship
between the number of interviews and the rated degree of
improvement was further analyzed statistically. Using the χ²
formula, a chi-square of 53.0 was found, showing that there is
less than 1 chance in 1000 that the relationship between
number of interviews and degree of improvement is due purely
to chance factors. Further analysis made to determine the
coefficient of relationship between these two variables, using
the curvilinear formula for epsilon as developed by Truman
Kelley, showed a coefficient of .24 between the number of
interviews and the rated degree of improvement.
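
To give a sense of how small the quoted probability is, the upper-tail probability of a chi-square of 53.0 can be computed in closed form when the degrees of freedom are even. The report does not state the degrees of freedom; the value of 4 used below (for example, a hypothetical 3 × 3 table of rating by interview-count group) is purely an assumption for illustration, as is the function name.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum over i < df/2 of (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

# df=4 is an assumed value; the report gives only the chi-square itself.
p = chi2_sf_even_df(53.0, df=4)
print(f"p = {p:.2e}")
```

Under that assumption the probability falls far below the 1-in-1000 threshold the author cites.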


Interpretations. We have used as a method of evaluating the
effectiveness of counseling that of the observation of the
everyday behavior of veterans, by a person other than the one
carrying on the counseling. The Training Official, in his
function of supervisor of veteran trainees, was in touch with
instructors and employers responsible for the actual
day-by-day training, and changes in attitude and in interest
in the training situation were readily observable to him.
Since it is a matter of judging overt behavior and school
progress, rather than subtle changes in mental attitudes, and
since it is a matter of “doing the job,” the Training Official
is perhaps placed in a favored position for making a
relatively objective judgment. Since the judgment of
“improvement” is a comparative one, and 82 per cent of the
cases continued to show improvement as much as six months
subsequent to the counseling, we may probably assume that
counseling received at a time of crisis or at a crucial point
in the veteran’s training helps him “over the hump,” and that
the insight received from the counseling provides a continuing
stabilizing influence in his adjustment. Conceivably, even a
relatively small amount of insight at such a crucial point
may, in the course of time, have a preventive function of some
weight.

On the liability side of the picture, we find 
that 18 per cent of the veterans did not respond 
to the personal adjustment counseling. Of 
these, 40 per cent did not even return for a 
second interview, and an additional 30 per 
cent had only two interviews. In other words, 
the counseling in general did not “take” with 
these clients, either because they were unable 
to cooperate, or conceivably because of the in- 
eptitude of the counselor in obtaining the nec- 
essary rapport. (The number of interviews 
given any particular client in our set-up was 
entirely dependent upon the desire of the client.)
The number of counseling failures, approximately
one-fifth of the cases, is probably
relatively small in view of the well-established 
clinical fact that it is impossible to help all 
those who need aid. 

Conclusions. The findings of this study on 
the effectiveness of personal adjustment coun- 
seling with veterans having minor maladjust- 
ments may be summarized: 

1. Four-fifths of the veterans referred for 
personal adjustment counseling were rated as 
showing improvement six months after its com- 
pletion. 

2. Two-fifths of the cases were estimated to 
have made much improvement. 

3. Approximately one-fifth of the cases failed 
to show any improvement. 

4. Within the limits shown in our study, 
the greater the number of counseling inter- 
views, the greater were the chances for im- 
provement. 

5. From the positive findings it may be in- 
ferred that the techniques, primarily nondirec- 
tive in nature, proved to be well suited for the 
brief psychotherapy afforded cases with minor 
maladjustment. 


Received November Z 1949. 





THE PREDICTABILITY OF SCHIZOPHRENIC PER- 
FORMANCE ON THE RORSCHACH TEST 


JULES D. HOLZBERG 
AND 


MURRAY WEXLER 


CONNECTICUT STATE HOSPITAL


PSYCHOLOGY has long recognized the importance of reliability
in evaluating a testing instrument. Consequently, it is
accepted that a psychological measuring instru- 
ment that is incapable of demonstrating high 
standards of reliability is unacceptable as a 
clinical tool. Where studies of a patient are fre- 
quently based on a single contact, the clinical 
psychologist must be assured of measurements 
that are consistent with what would result on 
retest. 

The Rorschach is the clinical psychologist’s 
most important diagnostic technique. However, 
the studies of the reliability of the Rorschach 
have been markedly contradictory. Bell [1] 
has reviewed these studies and the results are 
not impressive. Kerr [5], Swift [7], and Kim- 
ble [6], using the test-retest method of deter- 
mining reliability, report poor reliability find- 
ings. On the other hand, Fosberg [3], in a care- 
fully controlled study, found very reliable re- 
sults under test-retest conditions. Where the 
split-half technique of measuring reliability 
was used, similar disagreement was found. 
Vernon [8, 9] found low reliability, while 
Hertz [4] reported good reliability. 


It is therefore clear that the Rorschach test 
is still in great need of careful studies of reli- 
ability. A serious question is then raised. If the 
studies of reliability of the Rorschach with nor- 
mal subjects do not establish satisfactory re- 
liability, what is the effect of the use of an in- 
strument of unknown reliability with abnormal 
populations? More specifically, what is the re- 
liability of the Rorschach test when used with 
a schizophrenic population, a population which 
is clinically accepted as being “unpredictable”
both in its behavior and in its test performance? 
Wechsler [10] has described the schizophre-
nic’s performance on the Wechsler-Bellevue
Scale as “unpredictable” and Bochner and 
Halpern [2] have similarly described the 
schizophrenic in his Rorschach performance. 
Others have reported this unpredictability in 
terms of the failure of easy items with the con- 
current passing of hard items within a test. 
While this matter of unpredictability has been 
used to describe a patient’s performance with- 
in a test, the question is raised as to whether 
this process of unpredictability applies to the 
test-retest situation with schizophrenics. It has 
seemed to the authors that the whole question 
of “schizophrenic unpredictability” requires 
systematic studies of reliability of tests with 
schizophrenics as the tested population. This 
must be done in order to establish the reliabili- 
ty of the various techniques now being used in 
the clinical examination of schizophrenics al- 
though reliabilities may have been satisfactorily 
established for these tests with normal subjects. 
This becomes significant because in most in- 
stances interpretations of test results are based 
on the single administration of a test. If the 
results of a test are to change significantly upon 
retest without concurrent change in the clinical 
picture, serious question is thrown upon the 
validity of the psychologist’s study of the schiz- 
ophrenic patient. 

This research study was therefore organized 
to determine the reliability of one clinical in- 
strument, the Rorschach, when used with 
schizophrenics and conversely to determine the 
presence or absence of “unpredictability” of 
the schizophrenic in his Rorschach perform- 
ance. 

SUBJECTS 


The subjects of this study were twenty 
schizophrenics of varying symptomatology who
were characterized by one constant factor, i.e.,
their chronicity. All had been mentally ill for 
at least one year, hospitalized for at least the 
same period of time, and on an open-ward stat- 
us for at least one month. The study was limi- 
ted to chronic schizophrenics in order to elim- 
inate the factor of change from test to re-
test which might occur with acute schizophre-
nic conditions. Furthermore, it was desired to
eliminate any question of representativeness of 
results which might be raised in the examination 
of an acutely sick population. The selection of 
patients who were on an open-ward status for 
at least one month was based on the assumption 
that such patients would have reached a state 
of relative stability in their psychotic adjust- 
ment and therefore any changes on test to re- 
test were not likely to be a function of changes 
in their psychiatric status. 

The subjects consisted of nine men and 
eleven women, whose ages ranged from 26 to 
49 years with a mean of 38.9 years. The mean 
period of duration of illness was 11.6 years, 
with a range of from one year and four months 
to twenty-four years and five months. Length 
of hospitalization ranged from one year and 
one month to eighteen years and seven months, 
with a mean of 7.9 years. The subjects were on 
an open-ward status from one to six months, 
the mean being 3.8 months. 

All types of schizophrenics were represented 
in the study. Nine were diagnosed hebephrenic, 
five paranoid, three catatonic and three simple. 


TESTING PROCEDURE 


The Rorschach was administered to the sub- 
jects according to standard procedure. There 
was no testing of limits in order to eliminate 
the possibility that any responses might be sug- 
gested to the subjects who might then carry 
them over to the retest. Three weeks after 
the Rorschach test was first administered, it 
was again given under standard conditions. 


STATISTICAL PROCEDURE 


Two major statistical methods were applied 
to the data. The first was the correlation of 
significant Rorschach scoring categories and 
relationships on test and retest. Secondly, re- 
liability of the differences between the means 
of these factors was established. 

In computing the correlation of significant
Rorschach factors, Pearson’s product-moment
coefficient of correlation was utilized. The re- 
liability of each correlation was then deter- 
mined. 
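
How the reliability of a correlation may be expressed as a t ratio can be sketched as follows. The paper does not state the formula used, so the conventional one, with n - 2 degrees of freedom, is assumed here; under that assumption a correlation of .50 from twenty subjects gives the t of 2.45 that Table 1 reports for D. The function name is illustrative only.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t ratio for a product-moment correlation r based on n paired scores.

    Assumes the conventional formula t = r * sqrt(n - 2) / sqrt(1 - r^2);
    the source does not say which formula its authors applied.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# r = .50 with twenty subjects, as for location category D in Table 1.
print(round(t_from_r(0.50, 20), 2))  # 2.45
```
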

Reliability of differences between means was 
determined by the usual statistical methods. 
Means of the significant Rorschach factors for 
test and retest were derived. The standard de- 
viation, standard error of the mean and the 
standard error of the difference between 
means were computed. From this, the t for the
Rorschach scoring categories and relationships 
was determined. 


RESULTS 


The correlations for the various Rorschach 
factors considered in this study are presented 
in Tables 1-4. The t values indicate that all
location scoring categories show significant
correlations¹ at the five per cent level of con-
fidence (Table 1). Most of the determinants
show similarly significant correlations (Table 
2). All content with the exception of one (ab- 
stract) show significantly high correlations 
(Table 3). Most of the lower correlations ap- 
pear to be a function of the infrequency of oc- 
currence of these categories, as well as the 
narrow ranges of scores. All Rorschach rela- 
tionships show high correlations except two 
(D% and VIII-X%) (Table 4). Thus, 
it would seem that with the exception of a few 
Rorschach factors, the majority correlate high- 
ly from test to retest, i.e., the subjects tend to 
retain their same relative rank from one testing 
to the other. 

Tables 5-8 indicate the reliability of differ- 
ences between means of the various Rorschach 
factors studied. Only one factor is significant 
(at the five per cent level) and that is the 
average reaction time to the chromatic cards 
(Table 8). Thus, no significant mean differ- 
ences appear from test to retest except for this 
one factor. 

Results of these two statistical analyses may 
then be summarized as revealing, on the whole, 
significant correlations between Rorschach fac- 
tors with insignificant differences between 
means, both results tending to indicate little 
significant change from test to retest. Quali- 
tative analysis of the data, however, indicated 
fluctuations from test to retest and in several 


¹ t value at the five per cent level of signifi-
cance is 2.093.


TABLE 1
RELIABILITY OF CORRELATIONS FOR RORSCHACH
LOCATION CATEGORIES

Location    r      t
W           .95    12.86
D           .50    2.45
d           .90    8.75
Dd+S        .67    3.82


TABLE 5
RELIABILITY OF DIFFERENCES BETWEEN MEANS OF
RORSCHACH LOCATION CATEGORIES

Location    M1     SD1    M2     SD2    SE Diff.   t
W           4.15   3.10   4.98   8.16   13         1.08
D           5.10   3.37   4.20   3.17   .92        .98
d           .90    1.34   76     26     81         48
Dd+S        2.30   2.17   1.95   2.01   .55        .64


TABLE 2 TABLE 6 


RELIABILITY OF CORRELATIONS FOR RORSCHACH RELIABILITY O} DIFFERENCES BETWEEN MEANS OF 
DETERMINANTS RORSCHACH DETERMINANTS 
Determinant ? t Deter- 
minant M SD; M2 SD» SE Diff t 
M 88 7.86 
FM 63 3.44 = ' ' 
es var FM 1.05 1.32 80 98 32 97 
+0 
m ‘ ‘ m ‘ 0 4( RG ‘ 
b 14.54 " 10 0 7 71 
K 68 3.93 ¢ ( 0 is 10 
I 83 6.31 } 4 } 4.4 Ay ! ( 
I 35 1.56 Rf R7 
I 3.02 ’ f RO l ‘ 4 
i lé 73 Fé { l 6 e 
- CI 1 66 4 1.00 
) if 
( an , . ‘ 
( 49 2.38 P ‘ 
Fe 32 1.43 , g4 : 
( 24 1.04 
TABLE 3 rABLE 7 
Ru BILI F CORRELATIONS FOR RORSCHACH NI ITY OF DIFFERENCES BETWEEN MEANS 01 
CONTENT RORSCHACH CONTENT CATEGORIES 
rj here y i Content M M D SE Diff 
H 60 3.18 = i8 
Hd l ) l 
id 94 11.68 . 
A 71 a7 . 
\ l 4.2) Ad ' 
Ad 72 4.28 A+ - 
‘1 9§ 20.89 I 7 f 
Pr 61 3.26 t F 1.36 
c 14 54 Z.18 t ; 
en R 20.29 h 40 4 l 
hh 58 3.02 Ab : . . I 
4 * * 
) $6 1.¢ 5 . . 
TABLE 8 
TABLE 4 RELIABILITY OF DIFFERENCES BETWEEN MEANS Of! 
IABILI OF CORRELATIONS FoR RORSCHACH RORSCHACH RELATIONSHIPS 
RELATIONSHIPS Relation- 
R S - P p M sD M2 SD SE Diff 
AC Ol bhi 
x I 1.95 4.86 1 1.39 2 
R 56 2.86 \ 7 40 r m on E 
W 84 6.56 D 1 9 ) 0 19.4 g 17 
b> 34 1.53 a% 4.9 i.9 5.75 8.5 66 30 
d 84 6.56 Dd 15.45 15.16 11.95 14.97 4 
Dd+S% 45 2.13 I ¢ 0 25.17 66.85 24.41 6.46 58 
I 71 4.27 F-+-% 7.95 24.71 66,25 84 R6 ‘ 
F4o% 84 6.56 A‘ 55 024.35 = 8.60 2.2 6.92 1. 
Sum ( 1.40 1.16 1.18 19 ? 69 
c 70 5 : 7 
A ges 4.1 VIII-X% 29.60 10. 2¢€ ; | 4.34 61 
Sum ¢ 6 3.82 Av. R.T 
VITI-X% —.17 73 Act 17.17 28.57 14,26 10.35 6.16 47 
Av. R.T. Ach. 68 3.93 Av. R.T 
Av. R.T. Ch 5 2.72 Ch 22.48 12.39 15.26 10.31 7 28 
P 76 4.96 P 3.50 1.91 3.25 1. 29 86

cases the changes appeared quite marked. De-
spite the fact that statistical methods failed to 
reveal any significant changes, the question 
was raised whether the variations observed 
qualitatively were important and whether the 
techniques of statistical analysis used obscured 
them. Another consideration was that even if 
these changes were not statistically significant, 
they might be of such a nature as to influence 
clinical judgment in evaluating personality 
structure from test to retest. 

Because of this latter possibility, it was de- 
cided to determine whether these changes were 
clinically significant. Two trained Rorschach 
workers other than the authors were asked to 
select independently one of five retest Ror- 
schach tabulation sheets that matched each 
subject’s initial test tabulation sheet. The tabu- 
lation sheets consisted of a tabulation of loca- 
tions, determinants, content, and major Ror- 
schach relationships. Thus, each Rorschach 
worker made twenty choices, matching each in- 
itial test tabulation sheet with one of five retest 
tabulation sheets. The five choices of retest tabu- 
lation sheets were chance selections except that 
one of the five choices was always the correct re- 
test tabulation sheet. The results for this phase 
of the study revealed that both Rorschach 
workers were 85 per cent accurate in their se- 
lections (chi-square test shows this percentage 
to be very significant with chance occurrence 
less than one per cent). Seventeen out of the 
twenty matchings were made correctly. Two 
out of the three matchings that were missed 
were the same for both workers.
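
The significance of seventeen correct matchings out of twenty, where one sheet in five is correct by chance, can be checked with a short sketch. The paper names a chi-square test but does not report its value or how it was set up; the goodness-of-fit form below and the exact binomial tail are therefore assumptions, and the function names are illustrative.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def chi_square_goodness(n_correct: int, n: int, p: float) -> float:
    """Chi-square comparing observed correct/incorrect counts with chance."""
    expected = (n * p, n * (1 - p))
    observed = (n_correct, n - n_correct)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Twenty matchings, seventeen correct, chance of a correct choice one in five.
print(chi_square_goodness(17, 20, 1 / 5))
print(binomial_tail(20, 17, 1 / 5) < 0.01)  # True
```

Both calculations agree with the statement that chance occurrence is less than one per cent.
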


DISCUSSION 


The results of this study indicate that signi- 
ficant statistical changes cannot be demon- 
strated for Rorschach factors from test to re- 
test. The apparent variations observed quali- 
tatively are not significant clinically when 
trained Rorschach workers are asked to match 
test and retest tabulation sheets. It would thus 
appear that chronic schizophrenics are not “un- 
predictable” in their test-retest Rorschach per- 
formance but, on the contrary, are highly con- 
sistent as measured by clinical matchings and 
two statistical techniques, i.e., correlations and 
differences between means. 

“Unpredictability” on the Rorschach in 
terms of test-retest performance then is not ap-
plicable to a chronically ill schizophrenic popu-
lation which has had an opportunity to become
stabilized. “Unpredictability” may, however, 
be descriptive of acute schizophrenic conditions 
on one of two bases: results on the Rorschach 
test may change from test to retest because of 
changes in the patient’s condition which occur 
rapidly in acute states or changes may occur 
because of the unrepresentativeness of the pa- 
tient’s performance on one or both testings. It 
is therefore imperative that the psychologist 
assume responsibility for interpreting Rorschach 
records of schizophrenics in the light of both 
the representativeness of the patient’s perform- 
ance and the degree of acuteness of the illness. 

The need, however, still remains for reliabil- 
ity studies of other psychological tests on 
schizophrenic populations, both chronic and 
acute, if the enigma of the “unpredictability” 
of the schizophrenic is to be completely under- 
stood. 

SUMMARY 


In order to determine whether chronic
schizophrenics are “unpredictable” in their
Rorschach test performance, twenty chronic
schizophrenics were tested and retested with
the Rorschach test with a three-week period
intervening between the two testings. The data
were analyzed by two statistical methods. Cor-
relations and differences between means indi-
cated very few significant changes from test
to retest. The agreement in the matchings of
two trained Rorschach workers was extremely
high and substantiated further the reliability
of the Rorschach from test to retest with a
population of chronic schizophrenics.


Received November 30, 1949. 


REFERENCES 


1. Bell, J. E. Projective techniques. New
York: Longmans, Green, 1948.

2. Bochner, Ruth, and Halpern, Florence.
The clinical application of the Rorschach test.
(2nd Ed.) New York: Grune and Stratton,
1945.

3. Fosberg, I. A. Rorschach reactions under
varied instructions. Rorschach Res. Exch.,
1938, 3, 12-38.

4. Hertz, Marguerite R. The reliability of the
Rorschach ink-blot test. J. appl. Psychol., 1934,
18, 461-477.

5. Kerr, M. The Rorschach test applied to
children. Brit. J. Psychol., 1934, 25, 170-185.

6. Kimble, G. A. Social influence on Rorschach
records. J. abnorm. soc. Psychol., 1945, 40,
89-93.

7. Swift, J. W. Reliability of Rorschach scoring
categories with preschool children. Child
Developm., 1944, 15, 207-216.

8. Vernon, P. E. The Rorschach ink-blot test.
Brit. J. med. Psychol., 1933, 13, 89-118.

9. Vernon, P. E. The Rorschach ink-blot test.
III. Brit. J. med. Psychol., 1933, 13, 271-295.

10. Wechsler, D. The measurement of adult
intelligence. (3rd Ed.) Baltimore: Williams &
Wilkins, 1944.








SZONDI'S PICTURES: 
EFFECTS OF FORMAL TRAINING ON ABILITY TO 
IDENTIFY DIAGNOSES 


ALBERT I. RABIN 


MICHIGAN STATE COLLEGE 


IN a recent study [1], the present author
has shown that a group of trained psy-
chologists and a group of relatively (psy-
chologically) sophisticated undergraduate stu- 
dents were able to identify correctly the diag- 
noses of comparatively large numbers of per- 
sons depicted in Szondi’s [2] test cards. Each 
of Szondi’s six sets of portraits was projected
on a screen, before each group of subjects. Ev- 
ery subject was supplied with a checklist of the 
eight diagnoses represented by the eight pic- 
tures of each set. The diagnoses were listed in 
the following order: 1. homosexual, 2. sadist, 3. 
epileptic, 4. hysterical, 5. catatonic schizophre- 
nic, 6. paranoid schizophrenic, 7. depressed, 
and 8. manic. The means of correct identifica- 
tions for the psychologists and students were 
12.3 and 10.7 respectively. In both instances 
the obtained means were better than chance 
expectancy to a statistically significant degree, 
the chance probability of correct identifications 
being one of each set or a total of six pictures 
out of the entire series of 48. 

The better performance of the psychologists 
led the author to conclude that “training does 
make a difference.” This conclusion was
further substantiated by the fact that a larger 
number of portraits was correctly identified 
by statistically significant higher percentages 
of psychologists than students. This fact led to 
further experimentation which is subjected to 
description and evaluation in the present paper. 


PRESENT STUDY 


Essentially, the present investigation is a 
downward extension of the aforementioned 
experimentation; i.e., relatively unsophisticated
subjects, psychologically, were used. More- 
over, an attempt to test the effect of training 
in the same population was thought worthwhile
in checking, and possibly fortifying, the earlier
results and the hypothesis which they support. 

The Szondi pictures were presented to a 
summer school class in abnormal psychology, 
in the manner described above. All 33 mem- 
bers of the group were asked during the first 
class session to check the diagnoses, represented 
in Szondi’s pictures, regardless of their famil- 
iarity with them. There was no reason to as- 
sume that these students were any more famil- 
iar with the nosological classifications than any 
other college student who had no special or 
formal training in psychiatry or abnormal psy- 
chology. Six weeks later, during the last session 
of the course, the experiment was repeated ex- 
actly in the same manner as originally pre- 
sented. Thus, the effects of introducing the 
variable, i.e. formal training in abnormal psy- 
chology including discussions of the sundry di- 
agnostic classifications, could be checked and 
investigated. 


RESULTS 


It is interesting to note that the mean correct 
identifications for the group of students at the 
beginning of the course was 9.1. This is con- 
siderably better than the chance expectancy of 
the six correct identifications for the entire 
series of pictures. The difference between the 
two means is statistically significant (t =
5.48).¹ It must, therefore, be stated that cor-
rect identification, even in the instance of psy-
chologically unsophisticated college students,
is not a chance affair. Apparently many of the
pictures fit the physiognomic stereotype accepted 
in our society for such technically uncompli- 
cated diagnoses as manic, homosexual, de- 


¹The author is grateful to Mr. Wilson Guertin
for his assistance in the statistical treatment of the
data.

pressed, etc. As a matter of fact, the manics and
homosexuals were the most frequently correct- 
ly diagnosed in the study referred to above [1]. 

Even though our group of subjects did bet- 
ter than chance in the very beginning, further 
training in abnormal psychology enhanced 
their diagnostic success with Szondi’s pictures. 
After the six weeks of training, the group’s 
mean of correct identifications was raised to 
12.5. This figure is highly significant statistic-
ally, the t of the differences being 13.95. The
difference between this mean and the mean of 
earlier correct identifications is also statistically 
significant (t = 4.73). Thus, the conclusion
derived from the data reinforces the earlier 
findings that training does make a difference 
in the ability to identify psychiatric diagnoses 
from pictures of patients. 

Of course, a note of caution is in order. The 
conclusion needs to be qualified by the follow- 
ing two factors: 

a. The given range of diagnoses was limited 
by those presented in the checklist. 

b. The findings hold for Szondi’s pictures 
which were selected from textbooks and other 
sources. These pictures probably represent the 
most “typical” physiognomies corresponding to 
the given diagnoses. 

As in the previous investigation, the second 
point of major interest is the relation between 
ease of identification and the diagnostic cate- 
gory or, even, between ease of identification 
and some particular pictures in the series. Here, 
therefore, the focus of attention is centered 
upon the pictures themselves and only secondar- 
ily upon the subjects who identified them. 

In Table 1 a comparison of average percent-

TABLE 1
AVERAGE PERCENTAGES OF SUBJECTS WHO IDENTIFIED
CORRECTLY PICTURES IN EACH CATEGORY

Szondi          1st        2nd
Categories      Session    Session
Homosexual      37.9       47.0
Sadist          19.2       20.2
Hysterical      16.0       15.7
Epileptic       12.6       21.2
Catatonic       10.3       19.3
Paranoid        25.4       27.2
Depressed       21.3       26.8
Manic           24.0       55.5


ages of subjects making correct identifications 
in each category is made. With one minor de- 
viation (hysterical category) the general trend 
is that of increased percentages upon reexamin- 
ation. In the manic, homosexual, and catatonic 
categories, the increment is particularly notice- 
able. However, for a more detailed view of the 
changes in the ability of the students to identi- 
fy the diagnoses correctly, one needs to turn 
to the next table. 

Chance expectancy of correct identification 
of a diagnosis in each set which includes eight 
Szondi pictures is one-eighth, or 12.5 per cent 
of the population attempting it. Thus, consid-
erably larger percentages of the population
need to identify a picture correctly in order to 
judge the identifiability, i.e. ease of identifica- 
tion, of a picture to be statistically significant. 
The t’s of the differences between the actual
percentages of the experimental group correct- 
ly identifying the pictures and the chance per- 
centage of 12.5 were determined for each of 
the 48 pictures, for both test periods. Table 2 


TABLE 2
PICTURES ACCORDING TO SET AND DIAGNOSIS IDENTIFIED CORRECTLY
BY SIGNIFICANTLY LARGE NUMBERS OF THE POPULATION


I II III 

ime 2 s 2 
Homosexual X ~~ x 
Sadist - 
Hysterical 
Epileptic X 
Catatonic xX 
Paranoid . or 
Depressed xX -& ae 
Manic xX | oe xX 








IV Vv VI Total 
: ao : 3 ; 3 1 2 
aR: -_. 2 xX 5 4 
xX 0 2 
= a 1 1 
xX 0 2 
0 1 
a 2 2 2 
2 2 
X = 2 2 5 
12 19

gives a summary of the data on the individual
pictures representing different diagnoses and
appearing in the six sets of the Szondi test. The
cross marks (X) under 1 indicate the pictures
identified by large enough percentages to show
significance at the 0.1 per cent level of confi-
dence during the original testing period. The
marks in column 2 under each set, headed by
Roman numerals, indicate identification of the
corresponding diagnosis with the same degree 
of significance during the second experimental 
session. All categories but the homosexual 
show increases in the numbers of pictures iden- 
tified by statistically significant percentages of 
our population. The total of 12 pictures identi- 
fied by significantly large numbers in the first 
test administration increased to 19 in the se- 
cond one. Moreover, a number of pictures were 
identified by large enough numbers the second 
time to approach significance, as may be in-
ferred from the average percentages reported
in Table 1. 
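
The comparison of an observed percentage with the chance percentage of 12.5 can be sketched as a critical ratio. The paper does not give its formula, so the normal approximation to the binomial used here is an assumption, and the count of 15 correct identifications among the 33 subjects is a hypothetical figure chosen only for illustration.

```python
import math


def critical_ratio(n_correct: int, n_subjects: int, p_chance: float) -> float:
    """Observed proportion minus chance proportion, in standard-error units.

    Assumes the standard error of a proportion under the chance hypothesis,
    sqrt(p * (1 - p) / n); the source does not state its exact method.
    """
    observed = n_correct / n_subjects
    standard_error = math.sqrt(p_chance * (1 - p_chance) / n_subjects)
    return (observed - p_chance) / standard_error


# Hypothetical picture: 15 of the 33 subjects name the diagnosis correctly.
ratio = critical_ratio(15, 33, 0.125)
print(round(ratio, 2))
```

A picture identified by exactly one-eighth of the group yields a ratio of zero, and the ratio grows as the percentage rises above chance.
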


It is also interesting to note that the more 
easily identifiable diagnoses (homosexual and 
manic) in the present experiment are the same 
ones which were easily identified in the experi- 
ment alluded to above [1]. The differences in 
the remaining diagnoses are not sufficiently 
marked to deserve comment. 


DISCUSSION 


An examination of the data, therefore, clear- 
ly indicates that in some commonly familiar 
diagnoses (particularly homosexual) the physi- 
ognomic stereotype has considerable validity. 
Moreover, it also became evident that the or- 
dinary basic information included in a course 
in abnormal psychology enhances considerably 
the ability of college students to judge diag- 
noses from appearance recorded photographi- 
cally. The number of pictures identified by sig- 
nificantly large numbers rose from 12 to 19, 
thus approaching the 21 pictures identified by 
psychologically sophisticated students, but still 
far from the mark reached by psychologists in 
service who identified 28 [1]. 


As mentioned above, these portraits were 
probably selected for their “typicalness” and
fitness to illustrate the “textbook picture” of
the diagnoses. However, the significant fact
emerges that many diagnoses can be correctly
identified on the basis of pictures of patients,
even without the observation of the patients’
behavior, speech, posture, etc. It is reasonable 
to assume that the success of identification 
would be considerably enhanced if such addi-
tional clues were obtained from live patients in
face to face situations. Thus, we are led to re- 
iterate the comment made in connection with 
the earlier study that the results cast some 
doubt upon the validity of some of the 
“less objective” psychological procedures which
give the clinician considerable freedom in 
diagnosis and interpretation. In those face 
to face situations it is difficult to partial 
out the role played by the test or tech- 
nique itself and that played by the minimal, 
probably not fully conscious, clues from the 
patients’ appearance and behavior, absorbed by 
the examiner in the clinical situation. The pres- 
ent findings and inferences drawn from them 
point out the need for more precise “examina-
tion of the examiner” and the various “impres-
sions” gained during the interview which in
combination with the test findings determine 
the final diagnosis. This “soul searching” of 
the clinician is needed in order to determine 
the scientific value of his procedures and instru- 
ments as against his own function as a yet un- 
calibrated instrument. 


Finally, some implications for the Szondi test 
itself may be considered. The obtained results 
support the notion that the Szondi pictures as 
stimuli have some meaning. The fact that the 
picture-stimuli are identified with certain psy-
chological syndromes indicates that the por-
trayed physiognomies express those syndromes
to some extent. Consequently, abience or adi- 
ence expressed in regard to those pictures and 
presumably the psychological characteristics 
which they represent, may have some psycho- 
logical meaning. However, this bit of informa- 
tion is far from a validation of the personality 
descriptions offered by the Szondi proponents. 
Much further investigation is needed in order 
to accomplish this formidable task. 


SUMMARY 


The six sets of Szondi’s pictures were pro- 
jected on a screen for 33 college students at the 
beginning of a course in abnormal psychology. 
Each student was asked to check the correct 
diagnosis corresponding with each of the eight 
pictures in every set. The mean of correct iden-
tifications was 9.1. At the end of the course the
same procedure was repeated and the mean of 
correct diagnostic identifications rose to 12.5. 
Both means are statistically significant when 
compared with chance expectancy. The differ-
ence between these two means is also statistic-
ally significant. Twelve out of the 48 pictures
were identified by significantly large percentages 
of the population. The number increased to 19 
upon reexamination. The findings reinforce a 
previous conclusion, based on different popu- 
lations, that training in psychology enhances 
the ability to identify diagnoses from pictures
representing psychiatric patients.


Some theoretical implications of the find- 
ings for clinical psychology in general and 
Szondi rationale in particular are also dis- 
cussed. 


Received November 17, 1949.


A TEST OF A BASIC ASSUMPTION OF THE SZONDI¹


QUESTIONS about the value of some
of the assumptions of Szondi undoubt-
edly occur to the student of this meth-
od. This paper represents some of the results
of the writer’s attempt to evaluate the value 
of one basic assumption. 

Szondi assumes that each of the diagnostic 
categories has a prominent personality variable 
(called a need-system) associated with it. 
Hence, too few or too many selections within 
a diagnostic group (factor) implies a patholog- 
ical state (need-tension) in the corresponding 
need-system. A basic assumption underlying 
the application of the Szondi test is that the 
picture selections reflect the tensions existing in 
a subject’s need-system as categorized by Szon- 
di [1]. 

These tensions in the need-systems are con-
sidered to be fairly stable in well-adjusted sub-
jects; hence, retesting usually produces a pro-
file similar to that obtained on the earlier ad-
ministration. For a more complete description
of the test, the reader is referred to Deri’s
book [1].


STATEMENT OF PROBLEM 


The hypothesis to be considered here is that 
the eight Szondi need-systems fail to account 
adequately for test behavior. 


All who have used the Szondi test with nor- 
mals are familiar with the fairly high test-re- 
test agreement. An analysis of the nature of 
this agreement is vitally important. Are the 
same pictures selected upon retesting or is it 
that the same number of pictures is selected 
from the categories as on the initial test? 
Reasoning from the role of need-tensions, pic- 
tures selected will have the same distribution 
by categories upon the retest but should not 
necessarily correspond with those previously 

¹This investigation was conducted in the Psycho-
logical Laboratory of the Lincoln State School and
Colony; William W. Fox, Superintendent.


selected within each category. If a subject 
chooses as likes the hy’s in sets I and IV on the
first test, then there should be a selection of
two hy’s upon retesting. These retest selections
would not be likely to correspond with his 
earlier choices since only chance should deter- 
mine his selection within that category and 
there are four other pictures which he may 
choose. 

If the high test-retest agreement observed by 
the clinician is due to the selection of the same 
pictures upon retesting, the Szondi need-sys- 
tems have not accounted adequately for test 
behavior. 


PROCEDURE 


Twenty-four white male patients of the
Lincoln State School and Colony served as 
subjects for this study. Those Dull Normals, 
who showed no obvious neuro- or psychopath- 
ology, were included in the study. 

The Szondi Test was broken down into 
halves for administration. The first three sets 
of stimulus pictures were used as a unit and are 
designated as “A” for reference. The last three 
sets were also used as a unit and are designated 
as “B.” 

One half of the Szondi was administered to 
each of the subjects the first day, and they were 
all retested 24 hours later with half the Szondi. 
The Szondi halves were given to the subjects 
in administrative order as shown in Table 1. 
The subjects were randomly assigned to four 
groups of six subjects each. 


TABLE 1
ORDER OF ADMINISTRATION

Group           1st day    2nd day
Experimental    A          A
                B          B
Control         A          B
                B          A


Thus, 12 subjects were given the same half
of the Szondi on successive days (experimental
group); and 12 subjects were given different 
halves of the test (control group). The tests 
were given in accordance with Deri’s explicit 
instructions [1, p. 9]. All subjects cooperated 
to their fullest. 

The results were considered in terms of 
agreements between selections. For example, if 
a subject chooses h and s as likes on the first day for
set I and then selects s and p the following day for
set I, he is credited with one agreement. If he
reselected both h and s, there would be two agree-
ments for the likes of set I. The same proced-
ure was used for dislikes as well. In the cases
where a different half of the test is given upon 
readministration, the subject is credited with 
agreements when his choices fall in the cate- 
gory selected earlier and the set is in the same 
ordinal position of presentation. Two agree-
ments would be credited to the above subject
if, on being retested with “B,” he selected h and s
from set IV, since set IV is the first set of
“B.” The total number of agreements reported 
is the sum of those found for likes and dislikes 
for all three sets. The maximum possible num- 
ber of agreements is 12, which corresponds to 
the number of choices made on half a Szondi. 
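
The counting rule described above can be sketched in a few lines. The function names and the representation of a record as (likes, dislikes) pairs, one pair per set, are assumptions made for illustration; the sample selections beyond those named in the text are hypothetical.

```python
def count_agreements(first, second):
    """Number of categories chosen on both occasions for one set."""
    return len(set(first) & set(second))


def total_agreements(test, retest):
    """Sum of agreements for likes and dislikes over all sets.

    Each record is a list of (likes, dislikes) pairs, one pair per set,
    taken in the ordinal position in which the sets were presented.
    """
    total = 0
    for (likes_1, dislikes_1), (likes_2, dislikes_2) in zip(test, retest):
        total += count_agreements(likes_1, likes_2)
        total += count_agreements(dislikes_1, dislikes_2)
    return total


# The example from the text: h and s liked first, s and p liked on retest.
print(count_agreements({"h", "s"}, {"s", "p"}))  # 1
print(count_agreements({"h", "s"}, {"h", "s"}))  # 2

# Hypothetical half test of three sets repeated without any change.
half_test = [({"h", "s"}, {"p", "hy"})] * 3
print(total_agreements(half_test, half_test))  # 12
```

The last line reproduces the stated maximum of 12 agreements for half a Szondi: two likes and two dislikes in each of three sets.
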


RESULTS 


Table 2 presents the number of test-retest 
agreements for all subjects. The distribution of 
the total agreements for each subject, with ref- 
erence to the group within which they fall, is 
bimodal with no overlap. A chi-square test re- 
flecting the significance of this difference in 
total agreements was made with a cutting line
at six. The chi-square, with the Yates correction 
for continuity [6], is 20.2. So large a differ- 
ence between the two groups in test-retest 
agreements cannot be attributed to chance, 
since the level of confidence corresponding to 
the chi-square value with one degree of free- 
dom is less than one per cent. 
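
The reported chi-square can be approximately recovered from the per-subject totals. The sketch below is a check under stated assumptions rather than the author's computation: it assumes that "a cutting line at six" means totals of six or more against totals below six, it uses the usual fourfold formula with the Yates correction, and it takes the totals as read from Table 2 (a few cells are hard to read in this copy and are entered as the values consistent with the reported percentages).

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for a 2 x 2 table [[a, b], [c, d]] with the Yates
    correction for continuity."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Total agreements per subject, as read from Table 2.
experimental_totals = [7, 9, 9, 8, 6, 8, 9, 10, 9, 8, 10, 9]
control_totals = [4, 3, 2, 4, 5, 4, 2, 1, 1, 2, 5, 3]

CUT = 6  # assumption: "at or above six" versus "below six"

exp_high = sum(t >= CUT for t in experimental_totals)
exp_low = len(experimental_totals) - exp_high
con_high = sum(t >= CUT for t in control_totals)
con_low = len(control_totals) - con_high

chi_square = yates_chi_square(exp_high, exp_low, con_high, con_low)
print(round(chi_square, 1))
```

Because the two groups do not overlap, the fourfold table is 12 and 0 against 0 and 12, and the corrected statistic rounds to the 20.2 given in the text.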


TABLE 3
AGREEMENTS IN CHOICES ON SZONDI RETESTING
Group          Order        No. Possible Agreements   Per Cent Agreements
Experimental   A→A, B→B     144                       70.83
Control        A→B, B→A     144                       25.00
Chance                      816                       22.60


Table 3 presents the over-all frequency of 
agreements for the two groups. It will be noted 
that a quite high test-retest agreement of 70.83
per cent was obtained by the experimental 
group as compared with only 25.00 per cent 
for the control group. 

To compare the low test-retest agreement in 
the control group with that which would oc- 
cur by chance, the pictures were selected by a 
random procedure. Numbers from one to 
eight were assigned to each Szondi category, 
and were recorded from a random number 
table. The first two numbers of the table repre- 
sented the likes, while the next two represented 
the dislikes. The next four numbers had the 
same meaning but constituted the retest. The 
number of agreements was calculated as men- 
tioned above. This procedure yielded a chance 
agreement of 22.60 per cent.
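
For comparison with the empirical chance figure, the expected agreement rate can also be worked out exactly. The sketch below rests on an assumption the text does not settle: that each random selection consists of two different categories out of the eight (repeated numbers in the random number table would change the result slightly). The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

CATEGORIES = range(1, 9)  # the eight Szondi categories, numbered 1 to 8


def expected_agreement_rate(pick=2):
    """Expected share of agreements when two selections of `pick` distinct
    categories are drawn independently and at random."""
    selections = list(combinations(CATEGORIES, pick))
    total_overlap = sum(
        len(set(first) & set(second))
        for first in selections
        for second in selections
    )
    possible = len(selections) ** 2 * pick
    return Fraction(total_overlap, possible)


print(expected_agreement_rate())
```

Under this assumption the expected rate is one in four, which is in the neighborhood of both the 22.60 per cent obtained from the random number table and the 25.00 per cent observed in the control group.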


TABLE 2
AGREEMENTS IN CHOICES FOR EACH SUBJECT
Group    Order  Likes  Dislikes  Total     Group    Order  Likes  Dislikes  Total
Exper.   A→A    2      5         7         Control  A→B    2      2         4
                4      5         9                         3      0         3
                5      4         9                         1      1         2
                4      4         8                         2      2         4
                2      4         6                         4      1         5
                3      5         8                         2      2         4
Exper.   B→B    5      4         9         Control  B→A    1      1         2
                6      4         10                        1      0         1
                6      3         9                         1      0         1
                5      3         8                         1      1         2
                6      4         10                        2      3         5
                5      4         9                         1      2         3

A chi-square test of the significance of the
difference in agreement between the values ob- 
tained by chance and those of the control group 
was made through use of the cutting line. With 
the Yates correction a chi-square of 0.2 was ob- 
tained. With one degree of freedom this large 
a difference would be obtained less than 70 per 
cent of the time by sampling error and the null 
hypothesis is supported. 


DISCUSSION OF RESULTS 


The high picture-for-picture agreement seen 
upon retesting with the same Szondi pictures 
has not been adequately explained by the cur- 
rent rationale of need-systems. If a subject 
shows a fairly even distribution of factors, there 
is no reason to expect him to reselect exactly 
the same pictures on the retest. While one 
would expect approximately the same distribu- 
tions among the factors, the choice of the same 
pictures cannot be anticipated. It appears that 
special stimulus aspects of the pictures are be- 
ing ignored in current thinking about Szondi. 
It seems obvious, from the high test-retest 
agreement for the experimental group, that 
these special stimulus aspects play a large part 
in the determination of choices and merit more 
systematic attention. 

Recent studies by Harrower [3] and Rabin 
[4, 5] tend to support the hypothesis that spe- 
cial stimulus aspects are quite important in pic- 
ture selection. Harrower finds that preferences 
for specific pictures within a factor may be 
very marked (as high as 118:1 for choices of 
I:II among hy’s). 

There are many special stimulus aspects of 
the pictures that might determine a subject’s 
choice partially, yet not necessarily be related 
to Szondi need-systems. To illustrate, a subject 
might have a preference for mustaches. Among 
his like choices would be 1s, 1hy, 2, 1p, 2m.
In effect, these choices are almost randomly dis- 
tributed throughout Szondi’s need-systems, and 
therefore bear no strong relationship to any one
of them. Among the special stimulus aspects of 
the pictures, many may be related highly to the 
Szondi need-systems, while others may not 
show such a relationship. Apparently these 
need-systems were not established by a formal 
factor analysis, so there is no reason to expect 
them to be exhaustive or exclusive. 

Whether these special stimulus aspects are 
important through the operation of memory
or more directly through their valences is not 
apparent from this study. In spite of instruc- 
tions emphasizing the unimportance of mem- 
ory, it may play some part in the results. The 
influence of memory seems to be slight but the 
degree can be established only by further inves- 
tigation. 

A statistically significant difference in the 
test-retest reliability between the first and 
second halves of the Szondi was observed. This 
is mentioned as further evidence of the impor- 
tance of special stimulus aspects of the pictures. 

The reader is cautioned against interpreting 
these results as a refutation of either the Szondi 
method or Deri’s rationale. In a subsequent 
study the writer has found some evidence of 
validity for this method [2]. Weaknesses, cap- 
able of correction, are bound to appear in at- 
tempts to evaluate personality by new methods. 
It is the psychologist’s job to find them and 
further the theory development. This paper 
should serve merely as a guide for more ade- 
quate explanations of test behavior. The “per- 
sonal validations” in individual cases reported 
by clinicians suggest that the test shows prom- 
ise as a method of personality diagnosis. 

Utilization of the concept of need-systems 
seems to be a justifiable and desirable approach. 
However, it would appear that the Szondi fac- 
tors are too few and may have considerable 
overlap. Factor analysis might yield enough 
factors to explain the picture-for-picture re- 
test agreement adequately and still account for 
other aspects of test behavior. 


SUMMARY AND CONCLUSIONS 


1. The investigation conducted attempts to 
evaluate the adequacy of Szondi’s eight need- 
systems in accounting for test behavior. 

2. Twenty-four subjects of dull normal in- 
telligence were given half a Szondi and retested 
a day later with half a Szondi. 

3. Those who received the same half of the 
Szondi showed a high picture-for-picture agree- 
ment in their selections. Those who received 
a different half upon retesting showed chance 
agreement, as might be anticipated. Such re- 
sults could not be attributed to sampling error. 

4. This high picture-for-picture agreement 
could not be explained by current need-system 
theory. 


5. It is concluded that Szondi need-system
variables cannot be considered adequate for
explaining all test behavior. Factor analysis

loadings on the Szondi test. J. clin. Psychol.,
1950, 6, 262-266.


THE F MINUS K DISSIMULATION INDEX FOR THE
MINNESOTA MULTIPHASIC PERSONALITY 
INVENTORY 


HARRISON G. GOUGH 


UNIVERSITY OF CALIFORNIA, BERKELEY 


A recurring problem in the use of any personality
test is the question of dissembling.
The Minnesota Multiphasic
Personality Inventory (MMPI) is one of the
tests which makes explicit recognition of this 
problem by the inclusion of an internal set of 
validity indicators. Three of these, the ?, L, 
and F scores, have been well described in the 
test manual [4], and other articles [2 and 3, 
for example]. A fourth scale, K [6], may al- 
so be employed in the attempt to assess the de- 
pendability and trustworthiness of any ob- 
tained results. 


All four of these indicators, considered 
singly, will identify unreliable or malingered 
profiles with reasonable accuracy, but their 
maximum efficiency, apparently, is realized in 
combination. One of these combinations, F 
minus K, which was proposed by the writer in 
an earlier paper [3], appears to be the most 
promising index to date, and the present dis- 
cussion will be concerned with further appli- 
cations of the F—K index, and with normative 
data. 

It should be emphasized at the outset, how- 
ever, that at least three of these scales (L, F, 
and K) have important personological attributes
and implications in addition to their
function as validity indicators; that is, scores 
on these scales have relevance to the person- 
ality structure of the test-taker as well as to the 
reliability and meaningfulness of his test proto- 
col, per se. It is easy to make the mistake of 
overlooking the import of these scales in at- 
tempting to carry out a complete profile analy- 
sis. Accordingly, although the present paper 
will deal primarily with the validational as- 
pects of these scales, the discussion should not 
be interpreted as a denial or underemphasis of 
their other potentialities. 


PREVIOUS STUDIES


In the earlier paper on MMPI simulation 
[3] the efforts of a group of eleven clinical 
workers to feign two psychiatric syndromes 
were analyzed. The first syndrome was de- 
fined as “an acute, severe, anxiety neurosis 
which would lead to separation from the serv- 
ice, but not to commitment to a mental hos- 
pital,” and the second as “a nondeteriorated, 
acute, paranoid schizophrenic psychosis.” Skil- 
led judges were able to identify eight of the 
eleven simulated psychoneurotic patterns when 
intermixed with 68 authentic psychoneurotic 
records, and were able to identify all of the 
psychotic simulations when they were inter- 
calated with 24 authentic profiles. At the 
same time, a simple combination of the F raw 
score minus the K raw score was able to 
pick out ten of the eleven simulated records 
in each of the two situations. The F—K cutting
scores proposed at that time were plus 4
and over for neurotic profiles, and plus 16 
and over for psychotic profiles. Either F, or 
K, utilized singly, was fairly successful in 
separating the feigned from the authentic 
profiles, but neither was as effective as the 
combination. 
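
As a minimal illustration of how the index is applied, the sketch below subtracts the K raw score from the F raw score and compares the difference with a cutting score. The two constants are the cutting scores quoted in the paragraph above; the function names and the example raw scores are hypothetical.

```python
# Cutting scores proposed in the earlier paper, as quoted in the text.
NEUROTIC_CUT = 4
PSYCHOTIC_CUT = 16


def f_minus_k(f_raw, k_raw):
    """The dissimulation index: F raw score minus K raw score."""
    return f_raw - k_raw


def flags_simulation(f_raw, k_raw, cutting_score):
    """True when the index is at or above the cutting score
    ("plus 4 and over", "plus 16 and over")."""
    return f_minus_k(f_raw, k_raw) >= cutting_score


# Invented raw scores for one hypothetical record.
f_raw, k_raw = 22, 9
print(f_minus_k(f_raw, k_raw))
print(flags_simulation(f_raw, k_raw, NEUROTIC_CUT))
print(flags_simulation(f_raw, k_raw, PSYCHOTIC_CUT))
```

The same record can therefore be flagged against a neurotic profile yet pass against a psychotic one, which is why the choice of cutting score has to follow the kind of profile being screened.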


Several other studies have since employed 
this F—K index. Hunt [5] found that a cut- 
ting score of plus 11 and over would properly 
identify a substantial proportion of records giv- 
en by subjects attempting to simulate psychiat- 
ric disorder, and would mistakenly identify 
about twelve per cent of the records of psy- 
chiatric patients in a veterans hospital. Hunt 
also extended the use of the F—K index in an
attempt to detect records given by subjects try- 
ing to present an overly favorable test profile. 
A cutting score of minus 11 and below was 
fairly effective in picking up records of U. S.
Navy prisoners who had been asked to conceal 
any abnormality or maladjustment, but this 
cutting score also sorted out 93 per cent of the 
supposedly honestly produced profiles of a 
group of 109 ASTP students. Hunt concluded 
that F—K scores of 11 and over were highly 
suggestive of “faking bad,” but that more re- 
search would be needed before indices for the 
detection of “faking good” would be practi- 
cally serviceable. 


Sweetland [7] also used the F—K index in 
a study of MMPI test performance under 
hypnosis. Only a small proportion of the sub- 
jects instructed to simulate various psychiatric 
conditions under hypnosis were found to have 
F—K elevations above the critical levels pre- 
viously recommended. The subjects who did 
score above the critical levels were, in every 
case, the most deeply hypnotized. Sweetland 
used evidence of this kind to argue that the 
hypnotically induced neuroses were close to the 
“real thing,” and were not simulations in the 
conventional sense. Inspection of his profiles 
and tables suggests that the hypnotically in- 
duced depressions would be detected rather fre- 
quently by using a slightly lower F—K cutting 
score, but that the remaining dissemblings 
would elude this index. From this we might 
for the time being infer that hypnotic MMPI 
simulations are, indeed, undetectable for the 
most part; the consoling thought is that few 
of the subjects seen by clinical workers will be 
presenting themselves in trance states. 


Another study of dissembling was carried 
out by Cofer, et al. [1]. Three groups of col- 
lege students were used in this study, one of 
which took the MMPI honestly and then at- 
tempted to present an unfavorable, emotional- 
ly disturbed, picture; the second took the test 
honestly and then tried to give the best possible
impression; the third took the test twice
under normal conditions. For the “fake bad” 
group the best discrimination of records re- 
ported in the paper was that given by F alone, 
in which a cutting point of 20 (raw score) 
and over identified all of the dissembled proto- 
cols. A cutting point of 62 (T score) on K 
identified about 70 per cent of the falsified records.
For the “fake good” records, a combination
of K plus L gave the best results, correctly
classifying about 74 per cent of the malingered
profiles. An item analysis of these favorably
dissimulated records against the same subjects’
normal protocols resulted in a scale of
thirty-four items which worked very well on 
the original samples, and which gives promise 
of being a very interesting addition to the pool 
of MMPI keys. 

Cofer’s study made no mention of the F—K 
index, but a personal communication reveals 
that an F—K cutting score of 16 and over 
would correctly classify 26 of the 28 “fake 
bad” profiles, and would not misidentify any 
of the 28 honest records. A cutting score of 5 
and over would identify all of the malingered
cases, and would still not incorrectly classify 
any of the honest profiles. An F—K cutting 
score of minus 11 and below would detect 25 
of the 27 “fake good” cases, but would at the 
same time pick up 19 of the 27 honest rec- 
ords. The general results in regard to F—K 
are quite similar to those of Hunt’s study, in 
that profiles intended to simulate neurotic 
conditions are quite simply identified, but pro- 
files attempting to give an overly favorable 
picture are not so easily detected. 

A final bit of evidence on the efficacy of the 
F—K index is provided by an unpublished
study by Charles Bird at the University of 
Minnesota. Bird had a class in abnormal psy- 
chology take the MMPI first under normal 
conditions, and then again in an attempt to 
simulate various specific diagnostic syndromes;
following the completion of the course they 
were again requested to simulate these same 
syndromes. Disregarding the second attempt 
at simulation, and pooling all of the first at- 
tempts irrespective of specific diagnosis, it ap- 
pears that a cutting score of plus 4 and over 
would detect approximately 81 per cent of the 
records turned in by students in the dissimula- 
tion situation, and would not misidentify a 
single record among the normal protocols obtained.¹

All of these studies certainly suggest that
the F—K index is a useful indicator of the dependability
and meaningfulness of an MMPI
protocol. In order to colligate this material
and make it more directly comparable, the
writer has been able to procure the original
data from the studies of Cofer and Bird,² in
addition to some new material,³ and in the following
section various comparisons and normative
facts are presented.

¹Bird’s students were noticeably better during
their second simulation, as evidenced by a decrease
in mean F—K from 17.19 to 9.86, giving a t-ratio
of 8.87. Interestingly, however, the correlation between
first and second simulations (on F—K) was
.44, SE .06, indicating that the better simulators
on the first trial were also, for the most part, superior
on the second.


F—K FINDINGS


Table 1 gives the means and standard de- 
viations for the F—K distributions for the sev- 
eral samples included in this compilation. One 
clear-cut finding from this table alone is that 
all normal and clinical groups have F—K 
means of less than zero, whereas all dissem- 
bling groups have F—K means above zero. The 
differences between the dissemblers and all of 
the other samples are all highly significant. 
For example, the t-ratio of the difference between
dissemblers and adult normals is 29.54.


TABLE 1
F—K MEANS AND STANDARD DEVIATIONS
Groups                                                    N       M       SD
1. College students                                      269   -13.84    5.71
2. Adult normals                                         691    -8.96    6.97
3. University psychiatric hospital patients, males       250    -7.92    9.49
4. University psychiatric hospital patients, females     250    -8.70    7.41
5. Veterans Administration hospital psychiatric
   patients, males                                       100    -7.08    8.12
6. Army hospital psychiatric patients, males             213    -2.78   10.17
7. Experimental dissemblers, total sample                319    18.76   16.08
   a. Army subjects                                       22    14.09   11.20
   b. Cofer’s subjects                                    28    41.75   13.18
   c. Bird’s subjects                                    269    17.19   14.25
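
The t-ratio quoted for dissemblers against adult normals can be approximately reproduced from the summary figures in Table 1. The sketch below assumes the unpooled standard error of a difference between two independent means; the source does not say which formula was used, so this is a plausibility check rather than a reconstruction of the original computation. The adult normal mean is entered as negative, in line with the statement that all normal and clinical groups have means below zero.

```python
from math import sqrt


def t_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two independent means divided by its standard
    error (unpooled variances; an assumption, see the lead-in)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Summary statistics from Table 1.
dissemblers = (18.76, 16.08, 319)
adult_normals = (-8.96, 6.97, 691)

t = t_ratio(*dissemblers, *adult_normals)
print(round(t, 2))
```

The result agrees with the reported 29.54 to within rounding of the tabled means and standard deviations.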





One of the questions which arises from an
inspection of Table 1 concerns the degree of
susceptibility of the F—K index to factors of
psychiatric normality or maladjustment as such.
If the Army sample is excluded, which is
probably justifiable in light of the special conditions
prevailing during wartime and the fact
that many Army patients were under great
motivational stress to maximize their complaints,
and if the college students, who are
generally known to give somewhat compulsively
favorable self-portraits and descriptions [see
6, or any of the papers describing the development
of the MMPI clinical scales]⁴ are also
excluded, it would appear that the clinical and
normal groups are not differentiated on the
basis of F—K.

²The writer wishes to thank Dr. Charles Bird
and Dr. C. N. Cofer for their kindness in making
their data available to him.

³The records of the adult normals, and those of
the University psychiatric hospital patients, were
made available through the courtesy of Dr. Starke
Hathaway. The protocols of the Veterans Hospital
patients were very kindly furnished by Dr. George
S. Welsh.

Table 2 does, in fact, substantiate this impression,
for the F test of the significance of the
mean differences among the four groups considered
is not significant. This is a very helpful
finding, for it suggests that F—K is not sus- 
ceptible to distortion on the basis of psychiatric 
abnormality as such, certainly a most valuable 
property in any index which is to be used in 
screening dissemblers from normals and from 
actual psychiatric patients. 


TABLE 2
ANALYSIS OF VARIANCE TABLE FOR THE FOLLOWING GROUPS:
UNIVERSITY HOSPITAL MALES, UNIVERSITY HOSPITAL FEMALES,
VETERANS HOSPITAL CASES, AND ADULT NORMALS

Source     Sum of squares   d.f.   Variance estimate
Between         437.4418       3        145.814
Within       76,467.9076    1287         59.416
Total        76,905.3494    1290

F = 2.454, P > 0.05
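
The entries of the variance table can be cross-checked against Table 1. The sketch below recomputes the between-groups sum of squares from the four group sizes and means (entered as negative, following the statement that all normal and clinical means are below zero) and forms the F ratio from the tabled sums of squares. Variable and function names are hypothetical, and small discrepancies are to be expected because the tabled means are rounded to two decimals.

```python
# (N, mean F minus K) for the four groups compared, taken from Table 1.
groups = {
    "adult normals": (691, -8.96),
    "university hospital males": (250, -7.92),
    "university hospital females": (250, -8.70),
    "veterans hospital cases": (100, -7.08),
}


def between_groups_ss(groups):
    """Sum over groups of N times the squared deviation of the group mean
    from the grand mean."""
    total_n = sum(n for n, _ in groups.values())
    grand_mean = sum(n * m for n, m in groups.values()) / total_n
    return sum(n * (m - grand_mean) ** 2 for n, m in groups.values())


ss_between = between_groups_ss(groups)

# F ratio from the sums of squares and degrees of freedom printed in the table.
ms_between = 437.4418 / 3
ms_within = 76467.9076 / 1287
f_ratio = ms_between / ms_within

print(round(ss_between, 1), round(f_ratio, 3))
```

The recomputed between-groups sum of squares falls within about one unit of the tabled 437.44, and the ratio of the two variance estimates reproduces the reported F of 2.454.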


The sampling distribution of F—K for the 
original random sample of 691 adult normals 
is presented in Figure 1. Although it is moder- 
ately positively skewed, inspection suggests that 
its departure from normality is not very great. 


⁴Although this study does not give extended
treatment to the problem of “faking good,” the
F—K statistics on Cofer’s positively malingering
group were calculated, giving a mean of minus
17.37, SD 5.39. The t-ratio of this “fake good”
group versus Bird’s college students was 3.07, and
of the college students versus the adult normals
was 11.14. It thus appears that the college students
tended toward the “fake good” end of the continuum,
but were not altogether in the positive
simulation range. It should also be mentioned that,
although the mean of Cofer’s positively dissembling
group was significantly less than that of any
other, there was still an almost complete overlap
of cases. All of Cofer’s subjects were in the low
end of the F—K curve, but they were, nevertheless,
blanketed by the distribution of adult normals.
In comparison with the rather pronounced
skewness of both the F and K raw-score distributions
considered separately, this approximate
normality of the combined distribution is all
the more striking.

FIG. 1. Distribution of F—K scores for the adult normal sample (N = 691).

Another question which arises concerns the
relative frequency of classification of profiles as
malingered which would appear in the several
samples as different F—K cutting scores are
employed. These data are offered in Table 3.
With F—K cutting scores of plus 7 and over
most of the samples would show only a small
proportion of cases called dissembled. The Army
psychopaths and psychotics would, however,
show a sizeable percentage of misclassified
cases. One possible explanation of this
would be that many Army patients tended to
exaggerate their complaints and exploit their
difficulties in an attempt to facilitate medical
discharge or to secure medical dispensation for
a military offense. The proportion of dissembled
cases actually detected drops gradually
as the cutting score rises, but even at plus 16
and over 58 per cent are identified.

The final step in the analysis was to consolidate
all of the authentic profiles, clinical and
normal, into a total sample of 1,773 cases, and
to compare this sample with the dissemblers 
(319 cases) for the purpose of determining op- 
timum cutting scores. For this analysis the pro- 
portion of hits and misses for each sample was 
calculated, and the fourfold point correlation 
(phi) was taken as an indication of the degree 
of success in classifying records which would 
be achieved by any particular cutting score. 
Table 4 contains this material. 


From Table 4 it can be seen that the prob- 
lem of setting an F—K cutting score is one of 
minimizing false positives and false negatives. 
The highest phi coefficient is given by a cutting 
score of plus 9. This score would correctly 
classify 97 per cent of the authentic records 
and 75 per cent of the simulated records. If a 
clinical worker would prefer to use a cutting 
score such that, say, he would call authentic 
records malingered no more than five per cent 
of the time, but would still maximize the num- 
ber of simulated profiles he would call simu- 
lated, the best cutting score would be plus 7. 
An inspection of Table 4 will suggest other 
uses and other cutting scores for special situ- 
ations. 
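
The two selection rules discussed above can be sketched from the percentages in Table 4. The code below is an illustration under stated assumptions, not the writer's computation: it rebuilds the fourfold table from the rounded percentages and the two sample sizes, applies the standard fourfold point correlation formula, and uses hypothetical function names. Because the inputs are rounded, a recomputed phi can differ from the tabled value in the third decimal.

```python
from math import sqrt

AUTHENTIC_N = 1773
SIMULATED_N = 319

# (cutting score, per cent of authentic records called simulated,
#  per cent of simulated records called simulated), from Table 4.
TABLE_4 = [
    (0, 14.7, 88.4), (1, 12.0, 88.4), (2, 10.7, 86.2), (3, 9.1, 85.0),
    (4, 7.6, 83.7), (5, 6.5, 83.1), (6, 5.5, 81.2), (7, 4.8, 78.7),
    (8, 4.0, 76.5), (9, 3.0, 74.6), (10, 2.6, 71.8), (11, 2.3, 70.2),
    (12, 1.9, 67.7), (13, 1.7, 64.3), (14, 1.4, 63.3), (15, 1.2, 60.8),
    (16, 1.1, 58.3),
]


def phi_coefficient(false_positive_pct, hit_pct):
    """Fourfold point correlation between true status and classification."""
    hits = SIMULATED_N * hit_pct / 100
    misses = SIMULATED_N - hits
    false_positives = AUTHENTIC_N * false_positive_pct / 100
    correct_rejections = AUTHENTIC_N - false_positives
    numerator = hits * correct_rejections - misses * false_positives
    denominator = sqrt(
        SIMULATED_N * AUTHENTIC_N
        * (hits + false_positives) * (misses + correct_rejections)
    )
    return numerator / denominator


def best_cut_under(max_false_positive_pct):
    """Cutting score that detects the most simulated records while calling
    authentic records simulated no more often than the stated limit."""
    eligible = [row for row in TABLE_4 if row[1] <= max_false_positive_pct]
    return max(eligible, key=lambda row: row[2])[0]


print(round(phi_coefficient(3.0, 74.6), 2))  # cutting score of plus 9
print(best_cut_under(5.0))                   # the five per cent rule
```

A worker with a different tolerance for false positives can pass another limit to `best_cut_under`; the five per cent limit used in the text selects plus 7.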
TABLE 3
PROPORTION OF CASES IN EACH SAMPLE WHO WOULD BE CALLED DISSEMBLERS WITH
VARIOUS F—K CUTTING SCORES†

F—K      Army         Army       Army        Univ.   Univ.    VA      Adult    Univ.     Experimental
scores   psychopaths  neurotics  psychotics  hosp.   hosp.    hosp.   normals  students  dissemblers
                                             males   females  males
N        29           153        31          250     250      100     691      269       319
0        59           27         61          20      14       21      11       2         88
1        52           23         55          17      11       17      8        1         88
2        48           19         55          16      10       14      7        1         86
3        48           17         52          13      8        12      5        1         85
4        48           13         52          12      7        11      4        *         84
5        48           11         48          10      5        11      3        *         83
6        45           10         39          8       4        8       2        *         81
7        38           8          39          7       3        8       2        *         79
8        28           8          32          7       2        6       2        *         76
9        28           6          29          5       2        3       1        0         75
10       28           5          26          ?       1        3       1                  72
11       24           5          23          4       1        1       *                  70
12       24           5          16          3       *        0       *                  68
13       24           4          16          3       *                *                  64
14       14           3          16          2       *                *                  63
15       10           3          16          2       *                *                  61
16       10           2          16          2       *                *                  58

†A cutting score of “2,” for example, would mean that all subjects with F—K scores of 2 or more would
be classified as dissemblers.
*Proportion is less than one, but not zero.





TABLE 4
SCREENING EFFICIENCY OF THE F—K INDEX

               Authentic Profiles (1,773)    Simulated Profiles (319)
               Proportion    Proportion      Proportion    Proportion
               Called        Called          Called        Called        Phi
F—K scores     Authentic     Simulated       Authentic     Simulated     Coefficients
 0 and over    85.3          14.7            11.6          88.4          .597
 1  ”   ”      88.0          12.0            11.6          88.4          .637
 2  ”   ”      89.3          10.7            13.8          86.2          .644
 3  ”   ”      90.9           9.1            15.0          85.0          .674
 4  ”   ”      92.4           7.6            16.3          83.7          .694
 5  ”   ”      93.5           6.5            16.9          83.1          .713
 6  ”   ”      94.5           5.5            18.8          81.2          .724
 7  ”   ”      95.2           4.8            21.3          78.7          .723
 8  ”   ”      96.0           4.0            23.5          76.5          .728
 9  ”   ”      97.0           3.0            25.4          74.6          .742
10  ”   ”      97.4           2.6            28.2          71.8          .736
11  ”   ”      97.7           2.3            29.8          70.2          .736
12  ”   ”      98.1           1.9            32.3          67.7          .729
13  ”   ”      98.3           1.7            35.7          64.3          .710
14  ”   ”      98.6           1.4            36.7          63.3          .718
15  ”   ”      98.8           1.2            39.2          60.8          .706
16  ”   ”      98.9           1.1            41.7          58.3          .690


SUMMARY


Malingering and test dissimulation present
problems in the clinical use of personality
tests of all kinds. The validating scales of the
Minnesota Multiphasic Personality Inventory
were developed to assist in coping with such
problems. Previous studies devoted explicitly
to the problem of MMPI profile validity have
shown that all of the validating scales, but especially
a combination of the F raw score
minus the K raw score, have practical utility.

The F—K index has been demonstrated to 
detect “fake bad” profiles quite readily, but 
has been less efficient in detecting cases of posi- 
tive dissimulation. 

A consideration of a large number of normal 
and clinical cases suggests that the sampling 
distribution of F—K is reasonably normal,
and that this index is not distorted by psychi- 
atric abnormality as such. Both of these proper- 
ties strongly recommend it as a screening de- 
vice for profile validity. 





A table of phi coefficients for various cutting 
scores of F—K was provided which can be used 
by MMPI workers according to their own 
needs. The cutting score for the “faking bad” 
profiles which maximized the fourfold point 
correlation was plus nine. 





Received December 1, 1949.


A STUDY OF TWO PERSONALITY
QUESTIONNAIRES¹


WILLIAM CANNING 


ST. LOUIS UNIVERSITY 


GEORGE HARLOW 
CARLETON COLLEGE 
AND 
CLINTON REGELIN 


PALOS VERDES COLLEGE 


THE purpose of this study is to determine
if there exists a relationship between
several components of the Humm-Wadsworth
Temperament Scale and The Minnesota
Multiphasic Personality Inventory. The general
hypothesis is that there are five areas which,
by description, seem common to these tests. In
the Humm-Wadsworth, the areas of autistic,
paranoid, hysteroid, manic, and depressive
were compared respectively with the following
scales of the Minnesota Multiphasic Personality
Inventory: schizophrenia, paranoia, psychopathic
deviate, hypomania, and depression.
Table 1 shows a summary of data.

Both of these tests have been machine scored. 
The investigators devised stencils to be used 
with an IBM scorer for the Humm-Wads- 
worth and used the standard methods for scor- 
ing the Minnesota Multiphasic Personality 
Inventory. 

¹The writers wish to thank Dr. A. R. Gilliland
of Northwestern University for his aid in the use
of the Humm-Wadsworth Temperament Scale.


TABLE 1
MATCHED SCALES OF THE TWO TESTS AS BASED ON THE AUTHORS’ DEFINITIONS OF TRAITS

Autistic-Schizophrenia
  Humm-Wadsworth: Evidence of imaginativeness, retiring disposition
    and tendency to flinch from social situations.
  M.M.P.I.: Bizarre and unusual thoughts or behavior, splitting
    of subjective life from reality.

Paranoid-Paranoia
  Humm-Wadsworth: Evidence of tenacity of opinion and defense of
    systematized ideas.
  M.M.P.I.: Suspiciousness, over-sensitivity, delusions of persecution.

Hysteroid-Psychopathic Deviate
  Humm-Wadsworth: Evidence of self-concern, self-interest and
    possessiveness.
  M.M.P.I.: Absence of deep emotional response, inability to
    profit from experience and disregard of social mores.

Manic-Hypomania
  Humm-Wadsworth: Evidence of cheerfulness, drive toward activity,
    alertness and excitability.
  M.M.P.I.: Undertaking too many things, active, enthusiastic,
    disregards social conventions, stirs things up and
    then loses interest.

Depressive-Depression
  Humm-Wadsworth: Evidence of depression, retardation and indecision.
  M.M.P.I.: Poor morale of the emotional type, feeling of uselessness
    and inability to assume a normal optimism
    with regard to the future.


A total of 128 students were chosen from
students beginning their first education course
and students who were completing their course
in education (student teachers). Eighty-four
students were taken from the beginning course
in education, and forty-eight were student
teachers. The student teachers took the Humm- 
Wadsworth test first; the procedure was re- 
versed with the beginning students. The tests 
were administered according to the directions 
of the test constructors. A minor aspect, which 
may be noted here, was that the Minnesota 
Multiphasic Personality Inventory and the 
Humm-Wadsworth Temperament Scale 
showed no general overall differences between 
the means of the beginning students in educa- 
tion and those completing their course of study. 

A correlation study was made on the five 
apparently comparable components of the 
Humm-Wadsworth and the Minnesota Multi- 
phasic Personality Inventory tests. Correla- 


tions between the respective components may 
be noted in Table 2. 


TABLE 2
CORRELATION BETWEEN M.M.P.I. AND HUMM-WADSWORTH TEMPERAMENT SCALE

   Humm-Wadsworth   M.M.P.I.               Correlation
1. Autistic         Schizophrenia          -.02
2. Paranoid         Paranoia               -.05
3. Hysteroid        Psychopathic Deviate    .10
4. Manic            Hypomania               .20
5. Depressive       Depression              .07


The correlations thus listed lead to the con- 
clusion that one of the tests is not measuring 
the components as defined, or that both are 
measuring different aspects of the same com- 
ponent. 
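
For readers who wish to repeat this kind of comparison on their own records, the computation for one pair of matched scales can be sketched as follows. The sketch assumes the product-moment (Pearson) coefficient, which the report does not name explicitly, and the paired scores are invented for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Invented paired scores for six hypothetical students: one temperament
# component and the scale it is matched with.
component_scores = [12, 15, 9, 20, 14, 11]
matched_scale_scores = [18, 22, 25, 19, 21, 24]

print(round(pearson_r(component_scores, matched_scale_scores), 2))
```

A coefficient near zero, such as those in Table 2, indicates that standing on one scale tells little about standing on the matched scale.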


SUMMARY AND CONCLUSION 


The Minnesota Multiphasic Personality In- 
ventory (group form) and the Humm-Wads- 
worth Personality Questionnaire were used as 
measures of personality traits. The reason for 
employing these two measures was to verify 
or disprove the hypothesis that these two inven- 
tories measure five similar traits of personality. 
The low correlations obtained between the two 
questionnaires led to the conclusion that either 
one of the tests is not measuring the compon- 
ents of personality, or that the tests are meas- 
uring different aspects of the same component. 


Received November 18, 1949. 


Britt, Steuart H. (Ed.) Selected readings in social
psychology. New York: Rinehart, 1950.
Pp. xvi + 507. $2.00. 


Fifty well-edited extracts from writings of psy- 
chologists, sociologists and anthropologists make up 
this paper-bound volume of supplementary material 
for the undergraduate course in social psychology. 
Although the topics follow the order of the author’s 
Social psychology of modern life, a table correlates 
the contents with ten other textbooks. 


DORCUS, ROY M., AND JONES, MARGARET HUBBARD.
Handbook of employee selection. New York:
McGraw-Hill, 1950. Pp. xv + 349. $4.50.


This unusual book consists of uniformly arranged
abstracts of 426 research studies on the selection of
workers for specific jobs. The authors searched
over 300 different periodicals, foreign as well as
American, and found that most references had to
be rejected for insufficiency of data. Job and test
indexes permit quick reference to studies relating
to particular occupations, or in which specific tests
were used. Although the scope of the volume may
not justify calling it a “handbook,” it will prove a
convenient reference source for workers, teachers
and students in the field of vocational psychology.


FREEMAN, GRAYDON L., AND TAYLOR, ERWIN K.
How to pick leaders. New York: Funk & Wagnalls,
1950. Pp. vii + 226. $3.50.


A lively popular presentation, addressed to busi- 
ness executives. In the main its reports of research 
findings are sound, although presented without ref- 
erence to the experimental designs and limitations 
involved. Occasional references to the concept of 
“energy direction control,” of which the senior au- 
thor has written elsewhere, are insufficiently clear 
to be of functional value. The book may persuade 
some business men to abandon old and useless sys- 
tems of selection, but whether it will make them 
seek the help of competent industrial psychological 
consultants remains to be seen. 








Note: Some reviews in this issue were pre- 
pared by the Associate Editors, who may be identi- 
fied by their initials. Unsigned reviews are by the 
Editor.—L. F. S. 


Hamilton, Kenneth W. Counseling the handicapped 
in the rehabilitation process. New York: 
Ronald Press, 1950. Pp. vi + 296. $3.50. 


An informative book about rehabilitation, mainly 
from the viewpoint of social workers. The philosophy 
and economics of rehabilitation, the processes 
of appraisal, guidance, training and placement, 
and the mobilization of community resources, are 
covered well. The psychologist in the rehabilitation 
process is seen only as a psychometrician, and 
the section on testing seems the weakest part of 
the book. There is adequate recognition of the 
factor of personal adjustment in rehabilitation, its 
treatment being assigned to “psychiatrists and 
trained social workers.” 

Hamrin, Shirley A., and Paulson, Blanche B. 
Counseling adolescents. Chicago: Science Research 
Associates, 1950. Pp. x + 371. $3.50. 


The authors have given us a well-planned book 
for the preparation of those who are being trained 
to counsel adolescents. They frankly state that they 
“wish to give primary emphasis to those counseling 
methods which can be classed as explanatory, 
informative, and educative in character.” They have 
presented the leading theories on counseling and 
are especially interested in showing the student 
how these theories can be fitted into actual practice. 
The book keeps the practical aspects constantly 
in mind, with many examples of case material 
and actual interviews. It should prove of 
value not only as a textbook for those training for 
counseling, but will serve as a refresher to teachers, 
group leaders, social workers and others whose 
work brings them in contact with teen-agers.—B. M. L. 

James, William. The principles of psychology. 
Reprint edition. New York: Dover Publications, 
Inc., (1890), 1950. Two volumes bound as one. 
Pp. viii + 689 and vi + 702. $7.50. 


With the reprinting of James’s Principles just 
sixty years after its original publication, it cannot 
be doubted that American psychology has a classic. 
Rereading James brings a sense of perspective and 
even a little of humility to our regard for more 
modern achievements. His is a dynamic psychology, 
interested in real and striving people as 
shown in the chapters on instinct, the stream of 
thought, habit, emotion, and will. The abnormal- 
ities of behavior always interest him, and occupy 
a conspicuous place in his plan. Many problems 
of recent research are forecast by James. For ex- 
ample, lest we think that the concept of the rela- 
tion of need to perception was invented only yes- 
terday, we should reread his sentence, “To the broody 
hen the notion would probably seem monstrous 
that there should be a creature in the world to 
whom a nestful of eggs was not the utterly fasci- 
nating and precious and never-to-be-too-much-sat- 
upon object which it is to her.” And no other 
psychologist in the sixty years has written quite so 
well. The Principles is a grand old book, but it 
is a book with a future, too, to be remembered 
when most or all of 1950’s crop has been for- 
gotten. 


Kent, Grace H. Mental tests in clinics for chil- 
dren. New York: Van Nostrand, 1950. Pp. 
xii + 180. $2.45. 


This is a richly self-revealing book that no read- 
er can complete without a feeling that he has be- 
come quite well acquainted with Grace H. Kent. 
The author has attempted to distill her four dec- 
ades of clinical experience in the examination of 
children. She is at her best while describing what 
to observe, how to write a report, and how to use 
clinical resources to study people instead of test 
scores. Her greatest shortcoming seems to be a con- 
fused attitude toward test standardization, which 
is rejected because of the rigid interpretations of 
test scores that untrained workers often make, 
without adequate recognition of the values that 
standardization has for even the most broadly 
trained psychologists. Many of the clinical insights 
of the book are rewarding, and it contributes to a 
broadening of one’s historical perspectives. 


Klein, Henriette R., Potter, Howard W., and 
Dyk, Ruth B. Anxiety in pregnancy and childbirth. 
A Psychosomatic Medicine Monograph. 
New York: Paul B. Hoeber, 1950. Pp. ix + 111. 
$2.75. 

A psychiatric research on the reactions to pregnancy 
and childbirth of 27 primiparous women. 
The data and findings, stated mainly in descriptive 
terms, show the extent, nature and course of the 
anxieties aroused, and their relationships to pre- 
existing personality factors. A limitation of the 
study arises from the social status of the sample 
studied, who were clinic patients. Educational, 
occupational and other data show clearly that they 
were mainly drawn from a “lower-lower” cultural 
group. Their superstitions and misconceptions are 
almost surely related to that cultural status, and 
their anxieties quite probably so. Although the 
authors show awareness of the special nature of 
the sample, they do not refrain from drawing general 
conclusions. The appended case histories of 
the subjects provide useful material illustrative of 
a variety of reactions to a situation that may evoke 
anxiety. 


Landis, Carney, and Bolles, M. Marjorie. Textbook 
of abnormal psychology. (Rev. Ed.) New 
York: Macmillan, 1950. Pp. x + 634. $5.00. 


The revision of this text after only four years 
gives evidence of the scope of new research findings 
in abnormal psychology. In the main the 
changes consist of adding recent research, but they 
have been well integrated, not tacked on as afterthoughts. 
There is new material on numerous 
topics: on psychopathic personality, on heredity, 
and on psychosurgery including a rather full report 
of the Columbia-Greystone project. The valuable 
chapter on mental hygiene has been enlarged. In 
its revision the book remains one of the clearest 
elementary texts in its field. 


Pollak, Otto. The criminality of women. Philadelphia: 
Univ. of Pennsylvania Press, 1950. Pp. 
xxi + 180. $3.50. 


A study of some of the characteristics of female 
crime. Among the aspects discussed are the meth- 
ods of crime commission, the extent and specificity 
of female crime, the personal characteristics of the 
offender, an analysis of female crimes against per- 
sons and property, and the influential biological 
and social factors. The author shows that the ex- 
tent of female crime has been underestimated be- 
cause of the protective attitude toward women. 
Women are, however, exposed to many irritations 
and temptations which may lead them to criminal 
behavior. Except prostitution, no specifically fe- 
male crime is found. Only in the cultural deter- 
minants do we find behavior which can be called 
female. Examples are the subtle weapons used, 
the victims selected (children and intimates), and 
the relatively passive role played in the perpetration 
of the crime.—F. McK. 


Porteus, Stanley D. The Porteus Maze Test and 
intelligence. Palo Alto, Calif.: Pacific Books, 
1950. Pp. vii + 194. $4.00. 


As the author says in his Preface, “1914 seems a 
very long time ago” when the Porteus Maze Tests 
were first devised. Since that time, however, they 
have been the subject of continued research, much 
of it in the past decade. Although there were books 
about the maze test in 1919, 1924, and 1933, the 
author has no need to apologize for still another. 
Porteus reports the origin and development of the 
test, and the results of studies in many areas: 
feeblemindedness, industrial applications, racial 
differences, delinquency, brain surgery, and organic 
brain defects. The data are stated clearly and 
fairly, but the comments often depart from cold 
science toward a personal and autobiographical 
flavor with more than a trace of contentiousness. A 
complete manual for the administration and scor- 
ing of the test, and for quantitative and qualita- 
tive interpretation, is included. Regret must be 
expressed that, in determining the age-score ratio or 
test quotient, Porteus continues to tinker with the 
chronological age in order to adjust to variations 
in test difficulty at different age levels. It would 
have been far more rational to provide a conver- 
sion table to turn test ages into true performance ages. 
Small shortcomings, however, do not conceal the 
value of the maze test. It is one of the great origi- 
nal contributions to psychometrics of the first half 
of the century. The present volume replaces all 
others as a sufficient manual for the test, which 
should continue in wide use. 


Sargent, S. Stansfeld. Social psychology. New 
York: Ronald Press, 1950. Pp. x + 519. $4.50. 


An eclectic text in social psychology, written 
with the explicit objective of integrating the per- 
tinent findings of psychology, sociology, anthropol- 
ogy and the other social sciences. The book is 
systematically organized, and maintains both a 
dynamic viewpoint and an emphasis on research 
findings. Its thorough documentation adds to its 
value as a reference volume for clinical psycholo- 
gists whose major interest is not in social psy- 
chology, but who want a guide to primary sources 
in the areas where the clinical and social fields 
interact. 


Slavson, S. R. Analytic group psychotherapy. New 
York: Columbia Univ. Press, 1950. Pp. vii + 
275. $3.50. 

Slavson now divides group psychotherapy into 
two main types: activity group therapy and ana- 
lytic group psychotherapy. The latter constitutes 
the subject matter of this volume, and is further 
divided into three subgroups: play group psychotherapy 
(for preschool age); activity-interview 
group psychotherapy (for young school children); 
and interview group psychotherapy (for adolescents 
and adults). These are described in detail, 
their dynamics elaborated, and illustrative case 
material presented freely. Slavson considers these 
types of group therapy “fundamentally similar insofar 
as they are all based on transference, catharsis, 
interpretation and insight, and have their 
foundations in psychoanalysis.” The methodology 
employed in the various groups differs consider- 
ably, however, and much of the volume is devoted 
to these differences. One chapter, allotted to a 
discussion of the selection and grouping of patients, 
supplies a much needed rationale for the organi- 
zation of groups for therapy. Another chapter on 
group treatment of psychotics is rather inadequate, 
and adds little to a discussion of the literature on 
the subject which consumes the bulk of the space 
in the chapter. The sections of the book which are 
based on case material are the most valuable. 
Analytic group psychotherapy will probably be- 
come a standard work for non-nondirective thera- 
pists.—M. K. 


Sullivan, Albert J., and McKell, Thomas E. 
Personality in peptic ulcer. Springfield, Ill.: 
Charles C Thomas, 1950. Pp. x + 100. $3.00. 


Two gastroenterologists collaborated in this mono- 
graph, which advances a multiple-causation theory 
of peptic ulcer and advocates multiple treatment. 
Among the determining factors of constitution, per- 
sonality, precipitating situation, physiological trau- 
ma, unique factors, and resistance, they devote de- 
tailed attention only to the second and third. The 
ulcer personality is described, with the aid of clever 
little illustrations, as driving, versatile, active, rest- 
less, responsible, determined, conscientious and over- 
extended. A majority of patients show this person- 
ality pattern which does not yield readily to psy- 
chotherapy; a minority are neurotic and do need 
psychotherapy; another minority shows no personality 
deviation but has ulcer because of a combination 
of the other factors. There are eighteen 
case studies illustrating these types. 


Thompson, Clara. Psychoanalysis: evolution and 
development. New York: Hermitage House, 1950. 
Pp. xii + 252. $3.00. 


This book has grown out of a course of lectures 
which Dr. Thompson has been giving at the 
Washington School of Psychiatry in Washington, 
D. C. and the William Alanson White Institute of 
Psychiatry in New York. While admitting to a 
personal predilection for the “cultural interper- 
sonal school,” the author has attempted an objec- 
tive eclectic survey of the development of psycho- 
analysis which takes into account not only develop- 
ments in “orthodox” psychoanalysis but also the 
numerous deviant trends. As Dr. Thompson says, psy- 
choanalysis “offers a more effective and comprehen- 
sive method of therapy today than Freud could have 
possibly envisioned.” The author has been emi- 
nently successful in attaining her goal. The result 
is a book that is well written, stimulating, and a 
genuine contribution to the integration of the field. 
It can be recommended for class reading and for 
private professional perusal. This reviewer would 
consider it a must item from current publishers’ 
lists. —W. A. H. 


Trow, William Clark. Educational psychology. 
(2nd Ed.) Boston: Houghton Mifflin, 1950. Pp. 
ix + 761. $4.00. 


Trow conceives educational psychology as an 
applied science, drawing from all pertinent sources 
to contribute to an understanding of children as 
whole persons: their development, adjustment, 
learning and socialization. The earlier edition 
(1931) pioneered in emphasizing the topics of adjustment 
and guidance as important in the preparation 
of teachers. That emphasis continues, and 
the material is still good, but it is too much a 
repetition of what was written nineteen years ago. 
Insufficient recognition is given to newer view- 
points and research relating to conflict, anxiety, 
personality and allied concepts. The rest of the 
revision fares somewhat better. There is a new 
chapter on perception, and increased coverage of 
the social psychology of education. 


TESTS 


Bell, Hugh M. Personal Preference Inventory. 
College students. 1 form. Untimed (15) min. 
Questionnaire blank ($2.00 per 25), with key, 
manual, pp. 5; specimen set (15¢). Palo Alto, 
Calif.: Pacific Books, 1947, 1950. 


A questionnaire of 90 items, 30 in each of three 
areas, that is intended to evaluate maladjustment 
with respect to economic background (feeling of 
lacks at home), social attitude (critical attitudes 
toward others), and masculinity-femininity. The 
manual recommends its use in counseling college 
students only when full cooperation is given in 
responding. Corrected odd-even reliabilities of the 
three scores are given as .80, .85, and .84, respec- 
tively. The masculinity-femininity scale separates 
men and women validly; validation of the other 
two scales appears to be mainly subjective. Tenta- 
tive norms are based on 186 college men and 144 
college women. 


Bennett, George K. (Ed.) The Psychological 
Corporation General Clerical Test. Revised 
manual, pp. 16. New York: Psychological Corp., 
1950. 


The revised manual for the Psychological Cor- 
poration’s “GCT” gives extensive norms for the 
test, based on 4829 applicants for clerical jobs, 
veterans at guidance centers, girls in schools, and 
job applicants of 16 large businesses. Norms are 
expressed in percentiles for the clerical, numerical, 
verbal and total scores of the test. 

















PSYCHOLOGICAL MONOGRAPHS: GENERAL 
AND APPLIED 


Volume 64, 1950 


Patterns of Personality Rigidity and Some of Their Determinants. Seymour Fisher, 
Elgin State Hospital. #307, $1.00 


The Value of an Oral Reading Test for Diagnosis of the Reading Difficulties of 
College Freshmen of Low Academic Performance. Charles A. Wells, American 
International College. #308, $1.00 


Rorschach Responses Related to Vocational Interests and Job Satisfaction. Solis 
L. Kates, Michigan State College. #309, $1.00 


Symbol Elaboration Test (S.E.T.): The Reliability and Validity of a New Projective 
Technique. Johanna Krout, Chicago Psychological Institute. #310, $2.00 


Changes in Responses to the Minnesota Multiphasic Inventory Following Certain 
Therapies. William Schofield, University of Minnesota. #311, $1.00 


A Scale for Measuring Teacher-Pupil Attitudes and Teacher-Pupil Rapport. Carroll 
H. Leeds, Furman University. #312, $1.00 


The Nature and Efficacy of Methods of Attack on Reasoning Problems. Benjamin 
Burack, Roosevelt College. #313, $1.00 


The Validity of a Multiple-Choice Projective Test in Psychopathological Screening. 
Martin Singer, VA Hospital, Long Island. #314, $1.00 


A Normative Study of the Thematic Apperception Test. Leonard D. Eron, Yale 
University. #315, $1.50 


Experimentally Induced Variations in Rorschach Performance. Edith E. Lord, 
Arizona State Department of Health. #316, $1.00 


An Evaluation of Personality-Trait Ratings Obtained by Unstructured Assessment 
Interviews. Ernest C. Tupes, United States Air Force. #317, $1.00 


The 1950 volume of the Psychological Monographs will consist of eleven separate 
issues. Orders can be placed for any of these; the orders will be filled when 
the issues appear. On July 6, 1950, #307 and #308 had been published. 
The entire 1950 volume may be subscribed to for $6.00; it will probably be 
completed by December of 1950. 


American Psychological Association 


1515 Massachusetts Avenue N.W., Washington 5, D.C. 